Abstract

Temporal logic is two-valued: formulas are interpreted as either true or false. When applied to the analysis of stochastic systems, or systems with imprecise formal models, temporal logic is therefore fragile: even small changes in the model can lead to opposite truth values for a specification. We present a generalization of the branching-time logic CTL which achieves robustness with respect to model perturbations by giving a quantitative interpretation to predicates and logical operators, and by discounting the importance of events according to how late they occur. In every state, the value of a formula is a real number in the interval [0,1], where 1 corresponds to truth and 0 to falsehood. The boolean operators “and” and “or” are replaced by min and max, the path quantifiers $\exists$ and $\forall$ determine sup and inf over all paths from a given state, and the temporal operators $\Diamond$ and $\Box$ specify sup and inf over a given path; a new operator $\triangle$ averages all values along a path. Furthermore, all path operators are discounted by a parameter that can be chosen to give more weight to states that are closer to the beginning of the path.

We interpret the resulting logic DCTL over transition systems, Markov chains, and Markov decision processes. We present two semantics for DCTL: a path semantics, inspired by the standard interpretation of state and path formulas in CTL, and a fixpoint semantics, inspired by the $\mu$-calculus evaluation of CTL formulas. We show that, while these semantics coincide for CTL, they differ for DCTL, and we provide model-checking algorithms for both semantics.

1 Introduction


Boolean state-transition models are useful for the representation and verification of computational systems, such as hardware and software systems. A boolean state-transition model is a labeled directed graph, whose vertices represent system states, whose edges represent state changes, and whose labels represent boolean observations about the system, such as the truth values of state predicates. Behavioral properties of boolean state-transition systems can be specified in temporal logic [19,4] and verified using model-checking algorithms [4].

For representing systems that are not purely computational but partly physical, such as hardware and software that interact with a physical environment, boolean state-transition models are often inadequate. Many quantitative extensions of state-transition models have been proposed for this purpose, such as models that embed state changes into the real time line, and models that assign probabilities to state changes. These models typically contain real numbers, e.g., for representing time or probabilities. Yet previous research has focused mostly on purely boolean frameworks for the specification and verification of quantitative state-transition models, where observations are truth values of state predicates, and behavioral properties are based on such boolean observations [13,3,1,17]. These boolean specification frameworks are fragile with respect to imprecisions in the model: even arbitrarily small changes in a quantitative model can cause different truth values for the specification.

We submit that a proper framework for the specification and verification of quantitative state-transition models should itself be quantitative. To start with, we consider observations that do not have boolean truth values, but real values [16]. Using these quantitative observations, we build a temporal logic for specifying quantitative temporal properties. A CTL-like temporal logic has three kinds of operators. The first kind are boolean operators such as “and” and “or” for locally combining the truth values of boolean observations. These are replaced by “min” and “max” operators for combining the real values of quantitative observations. In addition, a “weighted average” ($\oplus_c$) operator computes a convex combination of two quantitative observations. The second kind of construct are modal operators such as “always” ($\Box$) and “eventually” ($\Diamond$) for temporally combining the truth values of all boolean observations along an infinite path. These are replaced by “inf” (“lim min”) and “sup” (“lim max”) operators over infinite sequences of real values. We introduce a “lim avg” ($\triangle$) operator that captures the long-run average value of a quantitative observation. For nondeterministic models, where there is a choice of future behaviors, there is a third kind of construct: the path quantifiers “for-all-possible-futures” ($\forall$) and “for-some-possible-future” ($\exists$) turn path properties into state properties by quantifying over the paths from a given state. These are replaced by “inf-over-all-possible-futures” and “sup-over-all-possible-futures.” Once boolean specifications are replaced by quantitative specifications, it becomes possible to discount the future, that is, to give more weight to the near future than to the far away future. This principle is well understood in economics and in the theory of optimal control [2], but is equally natural in studying quantitative temporal properties of systems [10]. We call the resulting logic DCTL (“Discounted CTL”). While quantitative versions of dynamic logics [16], $\mu$-calculi [14,20,21,10], and Hennessy-Milner logics [11] exist, DCTL is the first temporal logic in which the non-local temporal operators $\Diamond$ and $\Box$, along with the new temporal operator $\triangle$ and the path quantifiers $\forall$ and $\exists$, are given a quantitative interpretation.

We propose two semantics for DCTL: a path semantics and a fixpoint semantics. The path semantics is defined as follows. For a discount factor $\alpha < 1$, the $\Diamond_\alpha$ (resp. $\Box_\alpha$) operator computes the sup (resp. inf) over a path, weighting the value of a state that occurs $k$ steps in the future by a factor $\alpha^k$. As usual, the operators $\Diamond_\alpha$ and $\Box_\alpha$ are duals of each other. The $\triangle_\alpha$ operator computes the discounted long-run average of the values along a path (see, e.g., [2]), where the value of a state that occurs $k$ steps in the future is again multiplied by a factor $\alpha^k$; the $\triangle_\alpha$ operator is self-dual. The $\forall$ and $\exists$ operators then combine these values over the paths: in transition systems, $\forall$ and $\exists$ associate with each state the inf and sup of the values for the paths that leave the state; in probabilistic systems, $\forall$ and $\exists$ associate with each state the least and greatest expectation of the value for those paths (for Markov chains, there is a single expected value at each state, but for Markov decision processes, the least and greatest expected value are generally different). Thus, the path semantics of DCTL is obtained by lifting to a quantitative setting the classical interpretation of path and state formulas in CTL.

The fixpoint semantics is obtained by lifting to a quantitative setting the connection between CTL and the $\mu$-calculus [4]. In a transition system, given a set $r$ of states, denote by $\exists\mathrm{Pre}(r)$ the set of all states that have a one-step transition to $r$. Then, the semantics of $\exists\Diamond r$ for a set $r$ of states can be defined as the least fixpoint of the equation $x = r \cup \exists\mathrm{Pre}(x)$, denoted $\mu x.(r \cup \exists\mathrm{Pre}(x))$. We lift this definition to a quantitative, discounted setting by interpreting $\cup$ as pointwise maximum, and $\exists\mathrm{Pre}(x)$ as the maximal expected value of $x$ achievable in one step [10]. For a discount factor $\alpha < 1$, the semantics of $\exists\Diamond_\alpha r$ is obtained by multiplying the next-step expectation with $\alpha$, i.e., $\mu x.(r \cup \alpha \cdot \exists\mathrm{Pre}(x))$.

The path and fixpoint semantics coincide on transition systems, but differ on Markov chains (and consequently on Markov decision processes). This is illustrated by the Markov chain in Figure 1. Consider the DCTL formula $\phi$: $\exists\Diamond_\alpha q$, for $\alpha = 0.8$. According to the path semantics, there are two paths from $s_0$, each followed with probability 1/2: the first path has the discounted sup equal to 0.8, and the second has the discounted sup equal to 0.2; hence, $\phi$ has the value $(0.8 + 0.2)/2 = 0.5$ at $s_0$. According to the fixpoint semantics, $q \cup 0.8 \cdot \exists\mathrm{Pre}(q)$ has the value $\max\{0.2,\ 0.8 \cdot (1 + 0)/2\} = 0.4$ at $s_0$, and this is also the value of $\phi$ at $s_0$.

Fig. 1. A Markov chain illustrating the difference between path and fixpoint semantics. The states are labeled with the values taken by observation $q$.
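The two values in this example can be recomputed with a minimal Python sketch. The figure is not reproduced in this text, so the chain below is an assumed reconstruction that matches the numbers quoted above: s0 (q = 0.2) moves with probability 1/2 each to s1 (q = 1) and s2 (q = 0), and both successors are assumed to be absorbing. The function names are illustrative only.

```python
# ASSUMED reconstruction of the chain in Fig. 1 (the figure itself is lost):
# s0 (q = 0.2) moves to s1 (q = 1) or s2 (q = 0) with probability 1/2 each;
# s1 and s2 are assumed to be absorbing.
ALPHA = 0.8
Q = {"s0": 0.2, "s1": 1.0, "s2": 0.0}
P = {"s0": {"s1": 0.5, "s2": 0.5}, "s1": {"s1": 1.0}, "s2": {"s2": 1.0}}


def fixpoint_value(q, p, alpha, iterations=200):
    """Iterate x = max(q, alpha * E[x(next state)]) from the zero valuation."""
    x = {s: 0.0 for s in q}
    for _ in range(iterations):
        x = {s: max(q[s], alpha * sum(pr * x[t] for t, pr in p[s].items()))
             for s in q}
    return x


def discounted_sup(values, alpha):
    """max of alpha**i * values[i] over a finite prefix of a path."""
    return max(alpha ** i * v for i, v in enumerate(values))


def path_value_s0(q, alpha, horizon=50):
    """Expected discounted sup from s0: two paths, each with probability 1/2."""
    via_s1 = discounted_sup([q["s0"]] + [q["s1"]] * horizon, alpha)
    via_s2 = discounted_sup([q["s0"]] + [q["s2"]] * horizon, alpha)
    return 0.5 * via_s1 + 0.5 * via_s2


fix_s0 = fixpoint_value(Q, P, ALPHA)["s0"]
path_s0 = path_value_s0(Q, ALPHA)
print(round(fix_s0, 6), round(path_s0, 6))
```

Under these assumptions the fixpoint value at s0 is 0.4 and the path value is 0.5, so the gap comes from taking the maximum before rather than after the expectation.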

To highlight the different perspective taken by the two semantics, consider a water tank, and assume that $q$ represents the daily level of water in the tank (0 is empty, 1 is full). Consider the formula $\exists\Diamond q$.

Setting aside the discounting aspect, in the fixpoint semantics $\exists\Diamond q$ is the expected value of the amount of money we can realize by selling the tank (where a tank with level $q$ has value $q$), provided each day we choose optimally whether to sell. In the fixpoint semantics we must decide when to stop: the choice of selling the tank, or of waiting for one more day, corresponds to the choice between the two sides $q$ and $\exists\mathrm{Pre}(x)$ of the $\cup$ operator (interpreted as pointwise maximum) in the fixpoint. Hence the fixpoint semantics is suited for system control, since the decision of which side of $\cup$ to take corresponds to a control decision for the system.

In contrast, again setting aside discounting, in the path semantics $\exists\Diamond q$ is the expected value of the maximum level that occurs along a system behavior (discounting accounts for the fact that immediate emergencies are more serious than ones that are farther in the future). In the path semantics, we have no control over stopping: we can only observe the value of $q$ over infinite runs, and compute the expected maximum value it reaches. Such a semantics is well-suited for system specification.

In DCTL, discounting serves two purposes. First, it leads to a notion of “quality” with which a specification is satisfied. For example, assume that we wish to reach a state with a high value of $q$. Without discounting, the formula $\exists\Diamond q$ has the same value, regardless of the time required to reach $q$; on the other hand, the formula $\exists\Diamond_\alpha q$, for $\alpha < 1$, has a higher value if the high $q$ value is reached earlier. In other words, discounted reachability properties account not only for how well the goal is eventually satisfied (the value of $q$ that is reached), but also for how soon it is satisfied. Likewise, if $q$ represents the “level of functionality” of a system, then the specification $\forall\Box_\alpha q$ will have a value that is higher the longer the system functions well, even if the system will eventually always break. Second, discounting is instrumental in achieving robustness with respect to system perturbations. Indeed, we will show that for discount factors smaller than 1, the value of DCTL formulas in both semantics is a continuous function of the values of the numerical quantities (observations, transition probabilities) of the model.

We present algorithms for model checking both semantics of DCTL over transition systems, Markov chains, and Markov decision processes (MDPs). In all cases but one (the $\forall\Diamond$ operator in the fixpoint semantics of MDPs), the algorithms achieve polynomial time-complexity in the size of the system. For transition systems, we present algorithms for $\Box$ and $\Diamond$ that achieve linear-logarithmic running time, improving on the results presented in the preliminary version of this paper [9]. For Markov chains and MDPs, the fixpoint and path semantics are different; while the algorithms for the fixpoint semantics follow the approach of dynamic programming [2], the algorithms for the path semantics are novel (and coincide with those in [9]). Note that, due to the discounting, DCTL is a quantitative logic even when interpreted over purely boolean state-transition systems. As for CTL, the algorithms work recursively on the subformulas of a given formula. Due to the duality among the operators, we need to consider only the cases for $\exists\Diamond$, $\forall\Diamond$, and $\exists\triangle$.

In transition systems, the path and fixpoint semantics coincide. In [9] we presented algorithms for $\exists\Diamond$ and $\forall\Diamond$ that are based on iterating quantitative fixpoint expressions; the resulting time-complexity was quadratic. Here, we present improved algorithms of linear-logarithmic (i.e., $n \log n$) time complexity. The algorithm for $\exists\triangle$ (discounted long-run average along a path) builds on both Karp’s algorithm for computing minimum mean-weight cycles and a discounted version of Bellman-Ford for computing shortest paths; the resulting time complexity is cubic in the size of the transition system. In all cases, the time complexity is linear in the size of the formula.

For Markov chains, the fixpoint and path semantics differ. The model-checking algorithms for the fixpoint semantics rely on reductions to linear programming, following a common approach in optimal control [2]. The algorithms for the path semantics are based on a detailed analysis of the behavior of the paths outgoing from each state. In both cases, the time complexity is polynomial in the size of the system. However, the time complexity is exponential in the size of the DCTL formula, because the bit-wise encoding of the valuations grows exponentially with the nesting depth of temporal operators (in practice, of course, one would be unlikely to implement arbitrary-precision arithmetic).

In MDPs, the path semantics can be model-checked via reductions to linear programming. The main difficulty in the reduction is that the optimal policy with respect to the property is not necessarily memoryless; thus, we must phrase the linear programming problem in terms of quantities that preserve this past dependency. As in Markov chains, the resulting algorithms are polynomial in the size of the system, and exponential in the size of the formula. Lastly, we consider the model-checking of the fixpoint semantics of DCTL over MDPs. For the operators $\exists\Diamond$ and $\exists\triangle$ (as well as for their duals $\forall\Box$ and $\forall\triangle$), we show that we can compute the required fixpoints via reductions to linear programming, achieving polynomial time-complexity in the size of the system. On the other hand, for the $\forall\Diamond$ operator we present an algorithm of nondeterministic polynomial-time complexity with respect to the size of the system. The difficulty is due to the fact that the $\forall\Diamond$ operator combines a min over nondeterminism (the $\forall$ part) with a max over valuations (the $\Diamond$ part), precluding known avenues of reduction to linear programming, contrary to what was claimed in [9]. It is an open problem whether this algorithm can be improved, making all model-checking algorithms fall into polynomial time with respect to the size of the system.

A related approach to the specification and verification of quantitative systems has been proposed in [18]. There, the authors define an abstract quantitative $\mu$-calculus, based on constraint semirings, which provides a general framework for expressing properties of quantitative transition systems. They provide model-checking algorithms, based on the iterative evaluation of fixpoints, for a restricted class of formulas (c-CTL). They also point out the difference between the path semantics and the fixpoint semantics of the proposed language, in a similar fashion to what was done in the preliminary version of this paper [9] and is restated in the present work. The quantitative $\mu$-calculus defined there subsumes the languages presented in this work, when our definition is restricted to transition systems. However, even for the case of transition systems, our model-checking algorithms improve over the fixpoint iteration in the case of the $\Diamond$ and $\Box$ operators, and they allow for the efficient computation of the $\triangle$ operator, for which the standard fixpoint evaluation need not terminate within a finite number of steps.


2 Discounted CTL

2.1  Syntax 

Let $\Sigma$ be a set of propositions and let $\Lambda$ be a set of parameters. The DCTL formulas over $(\Sigma, \Lambda)$ are generated by the grammar

\begin{align*}
\phi &::= r \mid \mathrm{t} \mid \mathrm{f} \mid \phi \lor \phi \mid \phi \land \phi \mid \neg\phi \mid \phi \oplus_c \phi \mid \exists\psi \mid \forall\psi \\
\psi &::= \Diamond_c \phi \mid \Box_c \phi \mid \triangle_c \phi
\end{align*}

where $r \in \Sigma$ is a proposition and $c \in \Lambda$ is a parameter. The formulas generated by $\phi$ are state formulas; the formulas generated by $\psi$ are path formulas. The DCTL formulas are the state formulas. The $\triangle$-free fragment of DCTL is the set of DCTL formulas with no $\triangle_c \phi$ subformula.


2.2  Semantics  for  Labeled  Transition  Systems 

We define two semantics for DCTL: the path semantics, and the fixpoint semantics. In the path semantics, the path operators $\Diamond$ and $\Box$ determine the discounted sup and inf values over a path, and the $\exists$ and $\forall$ operators determine the maximum and minimum values of the path formula over all paths from a given state. The fixpoint semantics is defined by lifting to a quantitative setting the usual connection between CTL and $\mu$-calculus.


Discount factors. Let $\Lambda$ be a set of parameters. A parameter interpretation of $\Lambda$ is a function $\langle\cdot\rangle \colon \Lambda \to [0,1)$ that assigns to each parameter a real number in $[0,1)$, called a discount factor. We write $\mathcal{I}_\Lambda$ for the set of parameter interpretations of $\Lambda$. We denote by $|q|_b$ the length of the binary encoding of a number $q \in \mathbb{Q}$, and we denote by $|\langle\cdot\rangle|_b = \sum_{c \in \Lambda} |\langle c \rangle|_b$ the size of the interpretation $\langle\cdot\rangle$ of $\Lambda$.

Valuations. Let $S$ be a set of states. A valuation on $S$ is a function $v \colon S \to [0,1]$ that assigns to each state a real between 0 and 1. The valuation $v$ is boolean if $v(s) \in \{0,1\}$ for all $s \in S$. We write $\mathcal{V}_S$ for the set of valuations on $S$. We write $\mathbf{0}$ for the valuation that maps all states to 0, and $\mathbf{1}$ for the valuation that maps all states to 1. For two real numbers $u_1, u_2$ and a discount factor $\alpha \in [0,1)$, we write $u_1 \sqcup u_2$ for $\max\{u_1, u_2\}$, $u_1 \sqcap u_2$ for $\min\{u_1, u_2\}$, and $u_1 +_\alpha u_2$ for $(1-\alpha)\cdot u_1 + \alpha\cdot u_2$. We lift operations on reals to operations on valuations in a pointwise fashion; for example, for two valuations $v_1, v_2 \in \mathcal{V}_S$, by $v_1 \sqcup v_2$ we denote the valuation that maps each state $s \in S$ to $v_1(s) \sqcup v_2(s)$.

Labeled transition systems. A (finite-state) labeled transition system (LTS) $\mathcal{S} = (S, \delta, \Sigma, [\cdot])$ consists of a finite set $S$ of states, a transition relation $\delta \colon S \to 2^S \setminus \{\emptyset\}$ that assigns to each state a finite nonempty set of successor states, a set $\Sigma$ of propositions, and a function $[\cdot] \colon \Sigma \to \mathcal{V}_S$ that assigns to each proposition a valuation. We denote by $|\delta|$ the value $\sum_{s \in S} |\delta(s)|$. The LTS $\mathcal{S}$ is boolean if for all propositions $r \in \Sigma$, the valuation $[r]$ is boolean. A path of $\mathcal{S}$ is an infinite sequence $s_0 s_1 s_2 \cdots$ of states such that $s_{i+1} \in \delta(s_i)$ for all $i \ge 0$.

Given a state $s \in S$, we write $\mathit{Traj}_s$ for the set of paths that start in $s$.

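The path semantics. The DCTL formulas over $(\Sigma, \Lambda)$ are evaluated w.r.t. an LTS $\mathcal{S} = (S, \delta, \Sigma, [\cdot])$ whose propositions are $\Sigma$, and w.r.t. a parameter interpretation $\langle\cdot\rangle \in \mathcal{I}_\Lambda$. Every state formula $\phi$ defines a valuation $[\![\phi]\!]^p \in \mathcal{V}_S$:

\begin{align*}
[\![r]\!]^p &= [r] \\
[\![\mathrm{t}]\!]^p &= \mathbf{1} \\
[\![\mathrm{f}]\!]^p &= \mathbf{0} \\
[\![\neg\phi]\!]^p &= \mathbf{1} - [\![\phi]\!]^p \\
[\![\phi_1 \lor \phi_2]\!]^p &= [\![\phi_1]\!]^p \sqcup [\![\phi_2]\!]^p \\
[\![\phi_1 \land \phi_2]\!]^p &= [\![\phi_1]\!]^p \sqcap [\![\phi_2]\!]^p \\
[\![\phi_1 \oplus_c \phi_2]\!]^p &= [\![\phi_1]\!]^p +_{\langle c \rangle} [\![\phi_2]\!]^p \\
[\![\exists\psi]\!]^p(s) &= \sup\{ [\![\psi]\!]^p(\rho) \mid \rho \in \mathit{Traj}_s \} \\
[\![\forall\psi]\!]^p(s) &= \inf\{ [\![\psi]\!]^p(\rho) \mid \rho \in \mathit{Traj}_s \}
\end{align*}

A path formula $\psi$ assigns a real $[\![\psi]\!]^p(\rho) \in [0,1]$ to each path $\rho$ of $\mathcal{S}$:

\begin{align*}
[\![\Diamond_c \phi]\!]^p(s_0 s_1 \ldots) &= \sup\{ \langle c \rangle^i \cdot [\![\phi]\!]^p(s_i) \mid i \ge 0 \} \\
[\![\Box_c \phi]\!]^p(s_0 s_1 \ldots) &= \inf\{ 1 - \langle c \rangle^i \cdot (1 - [\![\phi]\!]^p(s_i)) \mid i \ge 0 \} \\
[\![\triangle_c \phi]\!]^p(s_0 s_1 \ldots) &= (1 - \langle c \rangle) \cdot \sum\{ \langle c \rangle^i \cdot [\![\phi]\!]^p(s_i) \mid i \ge 0 \}.
\end{align*}

The term $(1 - \langle c \rangle)$ in the definition of $[\![\triangle_c \phi]\!]^p$ is a normalizing factor ensuring that $[\![\triangle_c \phi]\!]^p(\rho) \in [0,1]$ for all paths $\rho$.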
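The three path operators can be evaluated exactly on an ultimately periodic path, i.e. a finite prefix followed by a cycle repeated forever. The Python sketch below is only an illustration of the definitions above: the lasso representation (`prefix`, `cycle`) and the function names are assumptions made for the example, and the cycle must be nonempty.

```python
def diamond(prefix, cycle, alpha):
    """Discounted sup of alpha**i * v_i along the path prefix . cycle^omega.

    Values lie in [0, 1] and alpha < 1, so later passes through the cycle can
    only shrink the weighted values; the prefix plus one pass suffices.
    """
    values = list(prefix) + list(cycle)
    return max(alpha ** i * v for i, v in enumerate(values))


def box(prefix, cycle, alpha):
    """Discounted inf of 1 - alpha**i * (1 - v_i), obtained from diamond."""
    flipped_prefix = [1.0 - v for v in prefix]
    flipped_cycle = [1.0 - v for v in cycle]
    return 1.0 - diamond(flipped_prefix, flipped_cycle, alpha)


def triangle(prefix, cycle, alpha):
    """(1 - alpha) * sum of alpha**i * v_i, with a geometric series for the cycle."""
    head = sum(alpha ** i * v for i, v in enumerate(prefix))
    one_pass = sum(alpha ** i * v for i, v in enumerate(cycle))
    tail = alpha ** len(prefix) * one_pass / (1.0 - alpha ** len(cycle))
    return (1.0 - alpha) * (head + tail)


# A path whose first state has value 0.2 and which then stays at value 1.
print(diamond([0.2], [1.0], 0.8), box([0.2], [1.0], 0.8), triangle([0.2], [1.0], 0.8))
```

For this path and discount factor 0.8, the discounted sup is 0.8, the discounted inf is 0.2 (attained at the first state), and the normalized discounted average is 0.2 · (0.2 + 0.8/0.2) = 0.84.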


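The fixpoint semantics. In this semantics, the DCTL formulas are evaluated with respect to an LTS $\mathcal{S}$ and a parameter interpretation $\langle\cdot\rangle \in \mathcal{I}_\Lambda$. Given a valuation $x \in \mathcal{V}_S$, we denote by $\exists\mathrm{Pre}(x) \in \mathcal{V}_S$ the valuation defined by $\exists\mathrm{Pre}(x)(s) = \max\{x(t) \mid t \in \delta(s)\}$, and we denote by $\forall\mathrm{Pre}(x) \in \mathcal{V}_S$ the valuation defined by $\forall\mathrm{Pre}(x)(s) = \min\{x(t) \mid t \in \delta(s)\}$. The fixpoint semantics $[\![\cdot]\!]^f$ for the propositions, the boolean operators, and $\oplus_c$ is similar to the path semantics, only that $[\![\cdot]\!]^p$ is replaced by $[\![\cdot]\!]^f$. The other operators are defined as follows:

\begin{align}
[\![\exists\Diamond_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f \sqcup (\mathbf{0} +_{\langle c \rangle} \exists\mathrm{Pre}(x))\bigr) \tag{1} \\
[\![\forall\Diamond_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f \sqcup (\mathbf{0} +_{\langle c \rangle} \forall\mathrm{Pre}(x))\bigr) \tag{2} \\
[\![\exists\Box_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f \sqcap (\mathbf{1} +_{\langle c \rangle} \exists\mathrm{Pre}(x))\bigr) \tag{3} \\
[\![\forall\Box_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f \sqcap (\mathbf{1} +_{\langle c \rangle} \forall\mathrm{Pre}(x))\bigr) \tag{4} \\
[\![\exists\triangle_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f +_{\langle c \rangle} \exists\mathrm{Pre}(x)\bigr) \tag{5} \\
[\![\forall\triangle_c \phi]\!]^f &= \mu x.\bigl([\![\phi]\!]^f +_{\langle c \rangle} \forall\mathrm{Pre}(x)\bigr) \tag{6}
\end{align}

Above, for a function $F \colon \mathcal{V}_S \to \mathcal{V}_S$, the notation $\mu x.F(x)$ indicates the unique valuation $x^*$ such that $x^* = F(x^*)$. Uniqueness of the fixpoints is proved in Theorem 1 for the more general case of Markov decision processes.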
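Equations (1)-(6) can be approximated on a small LTS by plain iteration from the zero valuation. The Python sketch below is a naive illustration, not one of the improved model-checking algorithms discussed in this paper; the dictionary encoding of the LTS, the tolerance, and all names are assumptions made for the example.

```python
def pre(x, delta, pick):
    """One-step valuation: pick=max gives the existential Pre, pick=min the universal one."""
    return {s: pick(x[t] for t in delta[s]) for s in delta}


def solve(step, states, tol=1e-12):
    """Iterate `step` from the zero valuation until successive iterates agree."""
    x = {s: 0.0 for s in states}
    while True:
        y = step(x)
        if max(abs(y[s] - x[s]) for s in states) < tol:
            return y
        x = y


def diamond_fix(phi, delta, alpha, pick):
    """Equations (1)/(2): x = phi join (alpha * Pre(x))."""
    def step(x):
        nxt = pre(x, delta, pick)
        return {s: max(phi[s], alpha * nxt[s]) for s in delta}
    return solve(step, delta)


def box_fix(phi, delta, alpha, pick):
    """Equations (3)/(4): x = phi meet ((1 - alpha) + alpha * Pre(x))."""
    def step(x):
        nxt = pre(x, delta, pick)
        return {s: min(phi[s], (1.0 - alpha) + alpha * nxt[s]) for s in delta}
    return solve(step, delta)


def triangle_fix(phi, delta, alpha, pick):
    """Equations (5)/(6): x = (1 - alpha) * phi + alpha * Pre(x)."""
    def step(x):
        nxt = pre(x, delta, pick)
        return {s: (1.0 - alpha) * phi[s] + alpha * nxt[s] for s in delta}
    return solve(step, delta)


# Hypothetical three-state LTS: a loops or moves to b, b moves to c, c loops.
DELTA = {"a": ["a", "b"], "b": ["c"], "c": ["c"]}
Q = {"a": 0.0, "b": 0.3, "c": 1.0}
print(diamond_fix(Q, DELTA, 0.5, max)["a"])  # existential case, about 0.25
print(diamond_fix(Q, DELTA, 0.5, min)["a"])  # universal case, 0.0
```

In this example the existential value at `a` comes from the path a b c (weight 0.5² on the value 1), while the universal value is 0 because the path that loops at `a` forever never sees a positive value.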

2.3  Semantics  for  Markov  Processes 

Given a finite set $S$, let $\mathrm{Distr}(S)$ be the set of probability distributions over $S$; for $a \in \mathrm{Distr}(S)$, we denote by $\mathrm{Supp}(a) = \{s \in S \mid a(s) > 0\}$ the support of $a$. A probability distribution $a$ over $S$ is deterministic if $a(s) \in \{0,1\}$ for all $s \in S$.


Markov decision processes. A (finite-state) Markov decision process (MDP) $\mathcal{S} = (S, \tau, \Sigma, [\cdot])$ consists of a finite set $S$ of states, a probabilistic transition relation $\tau \colon S \to 2^{\mathrm{Distr}(S)} \setminus \{\emptyset\}$, which assigns to each state a finite nonempty set of probability distributions over the successor states, a set $\Sigma$ of propositions, and a function $[\cdot] \colon \Sigma \to \mathcal{V}_S$ that assigns to each proposition a valuation. The MDP $\mathcal{S}$ is boolean if for all propositions $r \in \Sigma$, the valuation $[r]$ is boolean. We denote by $|\tau|_b$ the length of the binary encoding of $\tau$, defined by $\sum_{s \in S} \sum_{a \in \tau(s)} \sum_{t \in \mathrm{Supp}(a)} |a(t)|_b$, and we denote by $|[\cdot]|_b = \sum_{r \in \Sigma} \sum_{s \in S} |[r](s)|_b$ the size of the binary encoding of $[\cdot]$. Then, the binary size of $\mathcal{S}$ is given by $|\mathcal{S}|_b = |\tau|_b + |[\cdot]|_b$.

A finite (resp. infinite) path of $\mathcal{S}$ is a finite (resp. infinite) sequence $s_0 s_1 s_2 \ldots s_m$ (resp. $s_0 s_1 s_2 \cdots$) of states such that for all $i < m$ (resp. $i \in \mathbb{N}$) there is $a_i \in \tau(s_i)$ with $s_{i+1} \in \mathrm{Supp}(a_i)$. We denote by $\mathit{FTraj}$ and $\mathit{Traj}$ the sets of finite and infinite paths of $\mathcal{S}$, respectively; for $s \in S$, we denote by $\mathit{Traj}_s$ the infinite paths starting from $s$. A strategy $\pi$ for $\mathcal{S}$ is a mapping from $\mathit{FTraj}$ to $\mathrm{Distr}(\bigcup_{s \in S} \tau(s))$: once the MDP has followed the path $s_0 s_1 \ldots s_m \in \mathit{FTraj}$, the strategy $\pi$ prescribes the probability $\pi(s_0 s_1 \ldots s_m)(a)$ of using a next-state distribution $a \in \tau(s_m)$. For all $s_0 s_1 \ldots s_m \in \mathit{FTraj}$, we require that $\mathrm{Supp}(\pi(s_0 s_1 \ldots s_m)) \subseteq \tau(s_m)$. Thus, under strategy $\pi$, after following a finite path $s_0 s_1 \ldots s_m$ the MDP takes a transition to state $s_{m+1}$ with probability $\sum_{a \in \tau(s_m)} a(s_{m+1}) \cdot \pi(s_0 s_1 \ldots s_m)(a)$. We denote by $\Pi$ the set of all strategies for $\mathcal{S}$. The transition probabilities corresponding to a strategy $\pi$, together with an initial state $s$, give rise to a probability space $(\mathit{Traj}_s, \mathcal{B}_s, \mathrm{Pr}_s^\pi)$, where $\mathcal{B}_s \subseteq 2^{\mathit{Traj}_s}$ is the set of measurable subsets of $\mathit{Traj}_s$, and $\mathrm{Pr}_s^\pi$ is the probability measure over $\mathcal{B}_s$ induced by the next-state transition probabilities described above [15,22]. Given a random variable $X$ over this probability space, we denote its expected value by $\mathrm{E}_s^\pi[X]$. For $l \in \mathbb{N}$, the random variable $Z_l \colon \mathit{Traj}_s \to S$ defined by $Z_l(s_0 s_1 \ldots) = s_l$ yields the state of the stochastic process after $l$ steps.


Special cases of MDPs: Markov chains and transition systems.

Markov chains and LTSs can be defined as special cases of MDPs. An MDP $\mathcal{S} = (S, \tau, \Sigma, [\cdot])$ is a Markov chain if $|\tau(s)| = 1$ for all $s \in S$. It is customary to specify the probabilistic structure of a Markov chain via its probability transition matrix $P = [p_{s,t}]_{s,t \in S}$, defined for all $s, t \in S$ by $p_{s,t} = a(t)$, where $a$ is the unique distribution $a \in \tau(s)$. An initial state $s \in S$ completely determines a probability space $(\mathit{Traj}_s, \mathcal{B}_s, \mathrm{Pr}_s)$, and for a random variable $X$ over this probability space, we let $\mathrm{E}_s[X]$ denote its expectation. An MDP $\mathcal{S} = (S, \tau, \Sigma, [\cdot])$ is an LTS if, for all $s \in S$ and all $a \in \tau(s)$, the distribution $a$ is deterministic; in that case, we define $\delta \colon S \to 2^S$ by $\delta(s) = \{t \in S \mid \exists a \in \tau(s).\ a(t) = 1\}$.


The path semantics. The DCTL formulas over $(\Sigma, \Lambda)$ are evaluated with respect to an MDP $\mathcal{S} = (S, \tau, \Sigma, [\cdot])$ and with respect to a parameter interpretation $\langle\cdot\rangle \in \mathcal{I}_\Lambda$. The semantics $[\![\psi]\!]^p$ of a path formula $\psi$ is defined as for LTSs; we note that $[\![\psi]\!]^p$ is a random variable over the probability space $(\mathit{Traj}_s, \mathcal{B}_s, \mathrm{Pr}_s^\pi)$. Every state formula $\phi$ defines a valuation $[\![\phi]\!]^p \in \mathcal{V}_S$: the clauses for propositions, boolean operators, and $\oplus_c$ are as for LTSs; the clauses for $\exists$ and $\forall$ are as follows:

\begin{align*}
[\![\exists\psi]\!]^p(s) &= \sup\{ \mathrm{E}_s^\pi([\![\psi]\!]^p) \mid \pi \in \Pi \}, \\
[\![\forall\psi]\!]^p(s) &= \inf\{ \mathrm{E}_s^\pi([\![\psi]\!]^p) \mid \pi \in \Pi \}.
\end{align*}


The fixpoint semantics. Given a valuation $x \colon S \to [0,1]$, we denote by $\exists\mathrm{Pre}(x) \colon S \to [0,1]$ the valuation defined by $\exists\mathrm{Pre}(x)(s) = \max_{a \in \tau(s)} \sum_{t \in S} x(t) \cdot a(t)$, and we denote by $\forall\mathrm{Pre}(x) \colon S \to [0,1]$ the valuation defined by $\forall\mathrm{Pre}(x)(s) = \min_{a \in \tau(s)} \sum_{t \in S} x(t) \cdot a(t)$. With this notation, the fixpoint semantics $[\![\cdot]\!]^f$ is defined by the same clauses as for LTSs.
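The two one-step operators on an MDP reduce to a maximum or minimum over expected values. A minimal Python sketch follows; the encoding of the transition relation as a dictionary from states to lists of distributions, and the example data, are assumptions made purely for illustration.

```python
def expected(x, dist):
    """Expected value of valuation x under a next-state distribution."""
    return sum(p * x[t] for t, p in dist.items())


def exists_pre(x, tau):
    """Existential Pre: best expected next-step value over the available distributions."""
    return {s: max(expected(x, a) for a in tau[s]) for s in tau}


def forall_pre(x, tau):
    """Universal Pre: worst expected next-step value over the available distributions."""
    return {s: min(expected(x, a) for a in tau[s]) for s in tau}


# Hypothetical MDP: in s one may go to t for sure, or flip a fair coin between t and u.
TAU = {
    "s": [{"t": 1.0}, {"t": 0.5, "u": 0.5}],
    "t": [{"t": 1.0}],
    "u": [{"u": 1.0}],
}
X = {"s": 0.0, "t": 0.4, "u": 1.0}
print(exists_pre(X, TAU)["s"], forall_pre(X, TAU)["s"])
```

For a Markov chain every list in `TAU` has one element, so both operators coincide; for an LTS every distribution is deterministic, and the operators fall back to the max and min over successor states.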

For a valuation $x \in \mathcal{V}_S$, we denote by $\|x\|_\infty$ the infinity norm of $x$, namely $\|x\|_\infty = \max\{|x(s)| \mid s \in S\}$. For the fixpoint semantics to be well-defined, we have to show that the expressions on the right-hand side of (1)-(6) have a unique fixpoint. Given an operator $F \colon \mathcal{V}_S \to \mathcal{V}_S$, and a constant $\beta \in [0,1)$, we say that $F$ is a $\beta$-contraction if and only if, for all $x, y \in \mathcal{V}_S$, $\|F(x) - F(y)\|_\infty \le \beta \cdot \|x - y\|_\infty$.

Lemma 1 The operators occurring in the right-hand side of (1)-(6) are $\langle c \rangle$-contractions.

Proof. Let $\alpha = \langle c \rangle$ and $q = [\![\phi]\!]^f$. Given two valuations $x, y \in \mathcal{V}_S$, let $\|x - y\|_\infty = \varepsilon$. Let $F$ be the operator used in (1), namely $F(x) = q \sqcup \alpha\,\exists\mathrm{Pre}(x)$. For all $s \in S$, there are $a, b \in \tau(s)$ such that:

\begin{align*}
F(x)(s) &= q(s) \sqcup \alpha \sum_{t \in S} x(t)\,a(t), \\
F(y)(s) &= q(s) \sqcup \alpha \sum_{t \in S} y(t)\,b(t).
\end{align*}

We prove that $|F(x)(s) - F(y)(s)| \le \alpha \cdot \varepsilon$. The result is trivial if $F(x)(s) = F(y)(s) = q(s)$. If $F(x)(s) = q(s)$ and $F(y)(s) > q(s)$ we have

\[
|F(x)(s) - F(y)(s)| = F(y)(s) - q(s) \le \alpha \Bigl( \sum_{t \in S} y(t)\,b(t) - \sum_{t \in S} x(t)\,a(t) \Bigr).
\]

Symmetrically for the case when $F(y)(s) = q(s)$ and $F(x)(s) > q(s)$. Thus, in all cases it is sufficient to prove that $\bigl| \sum_{t \in S} x(t)\,a(t) - \sum_{t \in S} y(t)\,b(t) \bigr| \le \varepsilon$. Assume by contradiction that $\sum_{t \in S} x(t)\,a(t) - \sum_{t \in S} y(t)\,b(t) > \varepsilon$. Then, the value $\exists\mathrm{Pre}(y)(s)$ can be increased by taking action $a$ instead of $b$, in contradiction with the hypothesis that $b$ is the best action from $s$. Formally, we have

\[
\Bigl| \sum_{t \in S} y(t)\,a(t) - \sum_{t \in S} x(t)\,a(t) \Bigr| = \Bigl| \sum_{t \in S} (y(t) - x(t))\,a(t) \Bigr| \le \Bigl| \sum_{t \in S} \varepsilon\,a(t) \Bigr| = \varepsilon
\]

and thus $\sum_{t \in S} y(t)\,a(t) \ge \sum_{t \in S} x(t)\,a(t) - \varepsilon > \sum_{t \in S} y(t)\,b(t)$. Similar arguments hold for the remaining operators. $\Box$


As a consequence of this lemma, and by the contraction mapping theorem, we have immediately that the right-hand sides of (1)-(6) each have a unique fixpoint, showing that the fixpoint semantics is well-defined.

Theorem 1 The right-hand sides of (1)-(6) each have a unique fixpoint.
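The contraction property behind this result can be spot-checked numerically for the operator of equation (1). The Python sketch below samples random pairs of valuations on a small, hypothetical MDP and compares distances in the infinity norm; it illustrates the statement of Lemma 1 and is in no way a proof. The MDP, the seed, and the function names are assumptions.

```python
import random


def F(x, q, tau, alpha):
    """Operator of equation (1): F(x) = q join (alpha * existential Pre(x))."""
    return {
        s: max(q[s], alpha * max(sum(p * x[t] for t, p in a.items()) for a in tau[s]))
        for s in tau
    }


def sup_dist(x, y):
    """Infinity-norm distance between two valuations over the same states."""
    return max(abs(x[s] - y[s]) for s in x)


def check_contraction(q, tau, alpha, trials=1000, seed=0):
    """Return True if no sampled pair violates ||F(x)-F(y)|| <= alpha * ||x-y||."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = {s: rng.random() for s in tau}
        y = {s: rng.random() for s in tau}
        lhs = sup_dist(F(x, q, tau, alpha), F(y, q, tau, alpha))
        if lhs > alpha * sup_dist(x, y) + 1e-12:  # small slack for rounding
            return False
    return True


# Hypothetical MDP used only for this check.
TAU = {
    "s": [{"t": 1.0}, {"t": 0.5, "u": 0.5}],
    "t": [{"s": 0.3, "t": 0.7}],
    "u": [{"u": 1.0}, {"s": 1.0}],
}
Q = {"s": 0.1, "t": 0.6, "u": 0.9}
print(check_contraction(Q, TAU, 0.8))
```

Because each application of `F` shrinks distances by at least the discount factor, iterating `F` from any starting valuation converges geometrically to the unique fixpoint.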


2.4 Properties of DCTL

Throughout the rest of the paper, unless otherwise specified, we fix a parameter interpretation $\langle\cdot\rangle$, a set of propositions $\Sigma$, a proposition $r \in \Sigma$, and a parameter $c$, and write $[r] = q$ and $\langle c \rangle = \alpha$. We omit the superscripts $p$ and $f$ and just write $[\![\phi]\!]$ if the path and fixpoint semantics of $\phi$ coincide.

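2.4.1 Duality laws. For all state formulas $\phi$ over $(\Sigma, \Lambda)$, all MDPs with propositions $\Sigma$, all parameter interpretations of $\Lambda$, and all $* \in \{p, f\}$, we have the following equivalences:

\[
[\![\neg\exists\Diamond_c \phi]\!]^* = [\![\forall\Box_c \neg\phi]\!]^*, \qquad
[\![\neg\exists\Box_c \phi]\!]^* = [\![\forall\Diamond_c \neg\phi]\!]^*, \qquad
[\![\neg\exists\triangle_c \phi]\!]^* = [\![\forall\triangle_c \neg\phi]\!]^*.
\]

In particular, we see that $\triangle_c$ is self-dual and that a minimalist definition of DCTL will omit one of $\{\mathrm{t}, \mathrm{f}\}$, one of $\{\lor, \land\}$, and one of $\{\exists, \forall, \Diamond, \Box\}$.$^1$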

2.4.2 Comparing the two semantics. We show that the path and fixpoint semantics coincide over transition systems, and over Markov systems with boolean propositions (for non-nested formulas), but do not coincide in general over (non-boolean) Markov chains. This result indicates that the standard connection between Ctl and $\mu$-calculus breaks down as soon as we consider both probabilistic systems and quantitative valuations. The reason, essentially, is that in probabilistic systems with quantitative evaluations, the operator $\sqcup$ does not commute with the expectation operator $\mathrm{E}$. We start by proving that the two semantics always coincide for the $\triangle_c$ operator.

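Theorem 2 For all MDPs with propositions $\Sigma$, all parameter interpretations of $c$, and all $r \in \Sigma$, we have $\llbracket \exists\triangle_c r \rrbracket^{p} = \llbracket \exists\triangle_c r \rrbracket^{f}$ and $\llbracket \forall\triangle_c r \rrbracket^{p} = \llbracket \forall\triangle_c r \rrbracket^{f}$.

Proof. We prove the result for $\exists$, as the case for $\forall$ is analogous. Let $v = \llbracket \exists\triangle_c r \rrbracket^{f}$ and $u = \llbracket \exists\triangle_c r \rrbracket^{p}$. Writing out the definitions of the fixpoint and path semantics, we have, for all $s \in S$:

\[ v(s) = (1-\alpha) \cdot q(s) + \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t)\, v(t) \qquad (7) \]

\[ u(s) = (1-\alpha) \cdot \sup_{\pi \in \Pi} \mathrm{E}^{\pi}_{s} \Bigl[ \sum_{i \ge 0} \alpha^{i} q(Z_i) \Bigr] \qquad (8) \]

To prove that $u = v$, we prove that $u$ is a fixpoint of (7). For all $s \in S$, we have:

\[
\begin{aligned}
(1-\alpha) \cdot q(s) + \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t)\, u(t)
&= (1-\alpha) \cdot q(s) + \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t) \cdot (1-\alpha) \cdot \sup_{\pi \in \Pi} \mathrm{E}^{\pi}_{t} \Bigl[ \sum_{i \ge 0} \alpha^{i} q(Z_i) \Bigr] \\
&= (1-\alpha) \cdot \Bigl( q(s) + \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t) \cdot \sup_{\pi \in \Pi} \mathrm{E}^{\pi}_{t} \Bigl[ \sum_{i \ge 0} \alpha^{i} q(Z_i) \Bigr] \Bigr) \\
&= (1-\alpha) \cdot \Bigl( q(s) + \sup_{\pi \in \Pi} \mathrm{E}^{\pi}_{s} \Bigl[ \sum_{i \ge 1} \alpha^{i} q(Z_i) \Bigr] \Bigr) \\
&= (1-\alpha) \cdot \sup_{\pi \in \Pi} \mathrm{E}^{\pi}_{s} \Bigl[ \sum_{i \ge 0} \alpha^{i} q(Z_i) \Bigr] \\
&= u(s).
\end{aligned}
\]

□

Footnote 1. One cannot remove, say, both $\exists$ and $\Diamond$, because negation is not allowed in front of path formulas.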


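In the following result, we prove that on LTSs the two semantics coincide for $\Diamond_c r$ and $\Box_c r$ formulas. We first introduce some notation. It is a classical result from fixpoint theory that $\llbracket \forall\Diamond_c r \rrbracket^{f} = \lim_{n \to \infty} v_n$, where $v_n$ is defined as follows.

\[ v_0(s) = q(s), \qquad v_{n+1}(s) = q(s) \sqcup \alpha \cdot \min\{ v_n(s') \mid s' \in \delta(s) \}. \]

Let $\llbracket \forall\Diamond_c^{k} r \rrbracket^{p}$ denote the path semantics of the formula $\forall\Diamond_c r$ when only the first $k+1$ states of each trajectory are considered, that is,

\[ \llbracket \forall\Diamond_c^{k} r \rrbracket^{p}(s) = \inf_{s_0 s_1 \ldots \in Traj(s)} \; \sup_{0 \le i \le k} \alpha^{i} \cdot q(s_i). \]

Then, the following holds.

Lemma 2 For each step $k \ge 0$, we have $v_k = \llbracket \forall\Diamond_c^{k} r \rrbracket^{p}$.

Proof. The case $k = 0$ is trivial, since $v_0(s) = q(s) = \inf_{s_0 s_1 \ldots \in Traj(s)} \alpha^{0} \cdot q(s_0)$. For the inductive step, assume the thesis holds for some $k$. Then, writing $\rho = s_0 s_1 \ldots$ with $s_0 = s$,

\[
\begin{aligned}
\llbracket \forall\Diamond_c^{k+1} r \rrbracket^{p}(s)
&= \inf_{\rho \in Traj(s)} \; \sup_{0 \le i \le k+1} \alpha^{i} \cdot q(s_i) \\
&= \inf_{\rho \in Traj(s)} \max\Bigl\{ q(s), \; \sup_{1 \le i \le k+1} \alpha^{i} \cdot q(s_i) \Bigr\} \\
&= \max\Bigl\{ q(s), \; \inf_{\rho \in Traj(s)} \; \sup_{1 \le i \le k+1} \alpha^{i} \cdot q(s_i) \Bigr\} \\
&= \max\Bigl\{ q(s), \; \alpha \cdot \min_{s' \in \delta(s)} \; \inf_{\rho \in Traj(s')} \; \sup_{0 \le i \le k} \alpha^{i} \cdot q(s_i) \Bigr\} \\
&= \max\Bigl\{ q(s), \; \alpha \cdot \min_{s' \in \delta(s)} v_k(s') \Bigr\} \\
&= v_{k+1}(s),
\end{aligned}
\]

where the second-to-last step uses the induction hypothesis.

□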


The following theorem summarizes the relations between the semantics.

Theorem 3 The following assertions hold:

(1) For all LTSs with propositions $\Sigma$, all parameter interpretations of $A$, and all Dctl formulas $\varphi$ over $(\Sigma, A)$, we have $\llbracket \varphi \rrbracket^{p} = \llbracket \varphi \rrbracket^{f}$.

(2) For all boolean MDPs with propositions $\Sigma$, all parameter interpretations of $A$, and all Dctl formulas $\varphi$ over $(\Sigma, A)$ that contain no nesting of path quantifiers, we have $\llbracket \varphi \rrbracket^{p} = \llbracket \varphi \rrbracket^{f}$.

(3) There is a Markov chain $\mathcal{S}$ with propositions $\Sigma$, a parameter interpretation of $A$, and a Dctl formula $\varphi$ over $(\Sigma, A)$ such that $\llbracket \varphi \rrbracket^{p} \ne \llbracket \varphi \rrbracket^{f}$.

Proof. Part 1 is proved by structural induction on $\varphi$. The cases $\Box$ and $\Diamond$ are a consequence of Lemma 2. The case $\triangle$ is a consequence of Theorem 2.

Part 2 follows from the equivalence between the linear and the branching semantics of $\mu$-calculus in the case of strongly guarded formulas, as detailed in Theorem 6 of [7] and Theorem 3 of [10].

Part 3 is witnessed by the Markov chain in Figure 1. Formally, $\mathcal{S} = (S, \tau, \Sigma, \llbracket\cdot\rrbracket)$ with $S = \{s_0, s_1, s_2\}$, $\Sigma = \{r\}$, and $\llbracket r \rrbracket(s_0) = 0.2$, $\llbracket r \rrbracket(s_1) = 1$, and $\llbracket r \rrbracket(s_2) = 0$. From state $s_0$, there are two transitions, to $s_1$ and $s_2$, having probability 1/2 each; states $s_1$ and $s_2$ are sinks. We consider the Dctl formula $\exists\Diamond_c r$, along with a discount factor interpretation such that $\alpha = 0.8$. According to the path semantics, there are two paths from $s_0$, each followed with probability 1/2: the first path has discounted sup equal to 0.8, and the second has discounted sup equal to 0.2; hence, $\llbracket \exists\Diamond_c r \rrbracket^{p}(s_0) = (0.8 + 0.2)/2 = 0.5$. According to the fixpoint semantics, $\alpha \cdot \exists Pre(q)$ at $s_0$ is $0.8 \cdot (1 + 0)/2 = 0.4$, and $\max\{0.2, 0.4\} = 0.4$; thus, $\llbracket \exists\Diamond_c r \rrbracket^{f}(s_0) = 0.4$. □
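
The two values in this counterexample can be recomputed mechanically. The Python sketch below encodes the three-state chain as dictionaries and evaluates both semantics of the eventually-formula at the initial state: the path semantics as an expectation of the discounted supremum over enumerated paths, and the fixpoint semantics by Picard iteration. The encoding, the function names, and the truncation horizon are assumptions made for the example.

```python
ALPHA = 0.8
Q = {"s0": 0.2, "s1": 1.0, "s2": 0.0}
# Transition probabilities of the chain; s1 and s2 are sinks.
TAU = {
    "s0": {"s1": 0.5, "s2": 0.5},
    "s1": {"s1": 1.0},
    "s2": {"s2": 1.0},
}


def path_semantics(state, horizon=60):
    """E[sup_i ALPHA^i * Q(s_i)], enumerating all paths up to `horizon` steps."""
    def expand(s, i, best, steps_left):
        best = max(best, ALPHA ** i * Q[s])
        if steps_left == 0:
            return best
        return sum(p * expand(t, i + 1, best, steps_left - 1)
                   for t, p in TAU[s].items())
    return expand(state, 0, 0.0, horizon)


def fixpoint_semantics(rounds=100):
    """Picard iteration of v(s) = max(Q(s), ALPHA * sum_t TAU(s)(t) * v(t))."""
    v = {s: 0.0 for s in Q}
    for _ in range(rounds):
        v = {s: max(Q[s], ALPHA * sum(p * v[t] for t, p in TAU[s].items()))
             for s in Q}
    return v


print(path_semantics("s0"), fixpoint_semantics()["s0"])
```

On this chain the truncation is harmless because both sinks have a single successor; on a general chain the enumeration grows exponentially with the horizon and only bounds the path value from below.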
2.4.3 Robustness. Consider two MDPs $\mathcal{S} = (S, \tau, \Sigma, \llbracket\cdot\rrbracket)$ and $\mathcal{S}' = (S, \tau', \Sigma, \llbracket\cdot\rrbracket')$ with the same state space $S$ and the same set $\Sigma$ of propositions. We define

\[
\|\mathcal{S}, \mathcal{S}'\| = \max_{s \in S} \max\Bigl\{
\max_{r \in \Sigma} \bigl| \llbracket r \rrbracket(s) - \llbracket r \rrbracket'(s) \bigr|, \;
\max_{a \in \tau(s)} \min_{b \in \tau'(s)} \sum_{s' \in S} |a(s') - b(s')|, \;
\max_{b \in \tau'(s)} \min_{a \in \tau(s)} \sum_{s' \in S} |a(s') - b(s')|
\Bigr\}.
\]

It is not difficult to see that $\|\cdot,\cdot\|$ is a metric on the MDPs with state space $S$. This metric first considers one state at a time. For each state, the local distance is given by the maximum difference between the value of a proposition in that state in the two systems. The one-step distance, instead, considers the best way a transition from the first system can be matched by a transition in the second, and vice versa. For each state, the maximum of the local distance and the one-step distance is taken. Finally, the maximum over all states is taken.
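
The definition can be sketched in a few lines of Python. The code below assumes that the one-step distance between two distributions is the sum of the absolute differences of their probabilities over all successor states, and that the two directions of matching are combined by a maximum, as the prose description suggests. The data layout (dictionaries of proposition values and lists of distributions) and the name `mdp_distance` are choices made for this example.

```python
def mdp_distance(val1, tau1, val2, tau2):
    """Distance between two MDPs over the same states and propositions.

    val: dict state -> dict proposition -> value in [0, 1]
    tau: dict state -> list of distributions (dict successor -> probability)
    """
    def l1(a, b):
        # Sum of absolute probability differences (assumed one-step measure).
        support = set(a) | set(b)
        return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in support)

    def directed(first, second):
        # How well each action of `first` can be matched by one of `second`.
        return max(min(l1(a, b) for b in second) for a in first)

    dist = 0.0
    for s in val1:
        local = max(abs(val1[s][r] - val2[s][r]) for r in val1[s])
        step = max(directed(tau1[s], tau2[s]), directed(tau2[s], tau1[s]))
        dist = max(dist, local, step)
    return dist


# Two hypothetical two-state Markov chains (one action per state).
val_a = {"s0": {"r": 0.5}, "s1": {"r": 1.0}}
val_b = {"s0": {"r": 0.5}, "s1": {"r": 0.875}}
tau_a = {"s0": [{"s0": 0.5, "s1": 0.5}], "s1": [{"s1": 1.0}]}
tau_b = {"s0": [{"s0": 0.25, "s1": 0.75}], "s1": [{"s1": 1.0}]}

print(mdp_distance(val_a, tau_a, val_b, tau_b))
```

In this example the proposition values differ by 0.125 at `s1` and the transition distributions at `s0` differ by 0.25 on each of two successors, so the one-step distance dominates.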

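For an MDP $\mathcal{S}$ and a parameter interpretation $\langle\cdot\rangle$, we write $\llbracket\cdot\rrbracket^{p}_{\mathcal{S},\langle\cdot\rangle}$ and $\llbracket\cdot\rrbracket^{f}_{\mathcal{S},\langle\cdot\rangle}$ to denote the two semantics functions defined on $\mathcal{S}$ with respect to $\langle\cdot\rangle$. The following theorem characterizes the continuity of the fixpoint and path semantics.

Theorem 4 Let $\mathcal{S}$ and $\mathcal{S}'$ be two MDPs with state space $S$, and let $\langle\cdot\rangle$ be a parameter interpretation.

(1) For all $\varepsilon > 0$, for all Dctl formulas $\varphi$ and for all states $s \in S$, if $\|\mathcal{S}, \mathcal{S}'\| \le \varepsilon$, then $\bigl| \llbracket \varphi \rrbracket^{f}_{\mathcal{S},\langle\cdot\rangle}(s) - \llbracket \varphi \rrbracket^{f}_{\mathcal{S}',\langle\cdot\rangle}(s) \bigr| \le \varepsilon$.

(2) For all Dctl formulas $\varphi$, for all $\varepsilon > 0$, there is a $\delta > 0$ such that for all states $s \in S$, if $\|\mathcal{S}, \mathcal{S}'\| \le \delta$, then $\bigl| \llbracket \varphi \rrbracket^{p}_{\mathcal{S},\langle\cdot\rangle}(s) - \llbracket \varphi \rrbracket^{p}_{\mathcal{S}',\langle\cdot\rangle}(s) \bigr| \le \varepsilon$.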

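Proof. We first prove statement (1), by induction on the structure of formulas. Fix $\varepsilon > 0$. If $\varphi$ is a proposition $r$, then $|\llbracket r \rrbracket(s) - \llbracket r \rrbracket'(s)| \le \|\mathcal{S}, \mathcal{S}'\| \le \varepsilon$. The cases $T$, $F$ are obvious. If $\varphi = \neg\psi$, then

\[ \bigl| \llbracket \varphi \rrbracket^{f}_{\mathcal{S}}(s) - \llbracket \varphi \rrbracket^{f}_{\mathcal{S}'}(s) \bigr| = \bigl| 1 - \llbracket \psi \rrbracket^{f}_{\mathcal{S}}(s) - 1 + \llbracket \psi \rrbracket^{f}_{\mathcal{S}'}(s) \bigr| \]

and the result follows by induction. Similarly, Boolean operations are trivial by induction.

Now consider the path formulas. The technical result we need is that $|\exists Pre(f)_{\mathcal{S}} - \exists Pre(f)_{\mathcal{S}'}| \le \|\mathcal{S}, \mathcal{S}'\|$ for all valuations $f$, where $\exists Pre(f)_{\mathcal{S}}$ is the predecessor operator in MDP $\mathcal{S}$, and similarly, $\exists Pre(f)_{\mathcal{S}'}$ is the predecessor operator in $\mathcal{S}'$.

In order to prove the result for path formulas, we use the fixpoint definitions of the semantics of the path formulas. Also, we only look at formulas of the form $\exists\psi$, since $\llbracket \forall\psi \rrbracket^{f} = 1 - \llbracket \exists\neg\psi \rrbracket^{f}$ will follow from the induction hypothesis. We show the proof for $\varphi = \exists\Diamond_c \varphi'$. By induction on the structure of formulas, $|\llbracket \varphi' \rrbracket_{\mathcal{S}}(s) - \llbracket \varphi' \rrbracket_{\mathcal{S}'}(s)| \le \varepsilon$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \varepsilon$. The sequence $x_0 = 0$, $x_{n+1} = \llbracket \varphi' \rrbracket_{\mathcal{S}} \vee \alpha\, \exists Pre(x_n)_{\mathcal{S}}$ converges to $\llbracket \varphi \rrbracket^{f}_{\mathcal{S}}$. Similarly, the sequence $y_0 = 0$, $y_{n+1} = \llbracket \varphi' \rrbracket_{\mathcal{S}'} \vee \alpha\, \exists Pre(y_n)_{\mathcal{S}'}$ converges to $\llbracket \varphi \rrbracket^{f}_{\mathcal{S}'}$. We shall use induction on $n$ to show that these two sequences are close. For the base case, we have $|x_0(s) - y_0(s)| = 0 \le \varepsilon$. Assume by induction on $n$ that $|x_n(s) - y_n(s)| \le \varepsilon$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \varepsilon$. Then (the induction case)

\[ |x_{n+1}(s) - y_{n+1}(s)| = \bigl| (\llbracket \varphi' \rrbracket_{\mathcal{S}} \vee \alpha\, \exists Pre(x_n)_{\mathcal{S}})(s) - (\llbracket \varphi' \rrbracket_{\mathcal{S}'} \vee \alpha\, \exists Pre(y_n)_{\mathcal{S}'})(s) \bigr| \le \varepsilon. \]

Thus, $|\llbracket \exists\Diamond_c \varphi' \rrbracket_{\mathcal{S}}(s) - \llbracket \exists\Diamond_c \varphi' \rrbracket_{\mathcal{S}'}(s)| \le \varepsilon$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \varepsilon$. The cases for $\exists\Box$ and $\exists\triangle$ are similar.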

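We now proceed to statement (2). Notice that, unlike in statement (1), we first fix a formula, and let $\delta$ depend both on $\varepsilon$ and on the formula. We shall prove the theorem by induction on the structure of the formula. The cases for propositions and boolean operations are as before; we focus on path formulas. Again, we only consider existential formulas.

Consider the formula $\varphi = \exists\Diamond_c \varphi'$. By induction, we have proved for $\varphi'$ that for every $\varepsilon > 0$ there is $\delta > 0$ such that $|\llbracket \varphi' \rrbracket^{p}_{\mathcal{S}}(s) - \llbracket \varphi' \rrbracket^{p}_{\mathcal{S}'}(s)| \le \varepsilon$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \delta$. Now fix an $\varepsilon > 0$. We give a simple bound on $\delta$ such that $|\llbracket \varphi \rrbracket^{p}_{\mathcal{S}}(s) - \llbracket \varphi \rrbracket^{p}_{\mathcal{S}'}(s)| \le \varepsilon$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \delta$. First, we need only look at finite paths: choose $N$ such that $\alpha^{N} < \varepsilon/2$, where $\alpha = \langle c \rangle$ is the discount factor. Then, the contribution of any term occurring more than $N$ steps in the future of the path is bounded by $\varepsilon/2$. The difference $|\llbracket \exists\Diamond_c \varphi' \rrbracket^{p}_{\mathcal{S}} - \llbracket \exists\Diamond_c \varphi' \rrbracket^{p}_{\mathcal{S}'}|$ is certainly bounded by $|S|^{N} \cdot (N\varepsilon') \cdot \varepsilon'$, where the first term gives the number of sequences of length $N$, the second gives the difference in probabilities in the two MDPs along any behavior of length $N$, and the third term gives the difference in the valuations of $\varphi'$ in the two MDPs. The value $\varepsilon'$ is chosen as follows. For the above bound to satisfy $|S|^{N} N \varepsilon'^{2} \le \varepsilon/2$, we must set $\varepsilon' \le \bigl( \varepsilon / (2 N |S|^{N}) \bigr)^{1/2}$. Now by induction on $\varphi'$, we construct a $\delta'$ such that $|\llbracket \varphi' \rrbracket^{p}_{\mathcal{S}}(s) - \llbracket \varphi' \rrbracket^{p}_{\mathcal{S}'}(s)| \le \varepsilon'$ whenever $\|\mathcal{S}, \mathcal{S}'\| \le \delta'$, where we further ensure that $\delta' \le \varepsilon'$ (by taking the minimum of the constructed $\delta$ and $\varepsilon'$). Then, this $\delta'$ has the property that whenever $\|\mathcal{S}, \mathcal{S}'\| \le \delta'$, we have $|\llbracket \varphi \rrbracket^{p}_{\mathcal{S}}(s) - \llbracket \varphi \rrbracket^{p}_{\mathcal{S}'}(s)| \le \varepsilon$. The case $\exists\Box_c \varphi'$ is similar. Finally, the case $\exists\triangle_c \varphi'$ follows from part (1) because the path semantics is the same as the fixpoint semantics for $\triangle_c$. □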

Notice that in the continuity statement for the path semantics, $\delta$ depends on the formula in addition to $\varepsilon$. In general, an iterated application of the $\exists\Diamond$ operator can amplify an arbitrarily small difference in probabilities, as shown by the following example.

Fig. 2. A Markov chain illustrating Example 1.

Example 1 Consider the three-state Markov chain $\mathcal{S} = (\{s_0, s_1, s_2\}, \tau, \{r\}, \llbracket\cdot\rrbracket)$ in Figure 2. As shown in the picture, $\tau(s_0)$ is the distribution that chooses $s_1$ with probability $1 - \varepsilon$ and $s_2$ with probability $\varepsilon$, and $\tau(s_i)$ chooses $s_i$ with probability 1 for $i = 1, 2$. We then have $q(s_0) = q(s_1) = 0$ and $q(s_2) = 1$. Consider the Markov chain $\mathcal{S}'$ that differs from $\mathcal{S}$ in that $\tau(s_0)$ chooses $s_1$ with probability 1. Then $\|\mathcal{S}, \mathcal{S}'\| = \varepsilon$. Now consider the formulas $(\exists\Diamond_c)^{n} r$, for $n \ge 1$. Let $x_n = \llbracket (\exists\Diamond_c)^{n} r \rrbracket^{p}_{\mathcal{S},\langle\cdot\rangle}(s_0)$. Then $x_{n+1} = (1 - \varepsilon) \cdot x_n + \alpha \cdot \varepsilon$, and the limit as $n$ goes to $\infty$ is $\alpha$. On the other hand, $\llbracket (\exists\Diamond_c)^{n} r \rrbracket^{p}_{\mathcal{S}',\langle\cdot\rangle}(s_0) = 0$ for all $n$.
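
The amplification in this example is easy to observe numerically. The sketch below simply unrolls the recurrence for the value at the initial state, starting from the value 0 of the proposition itself (nesting depth 0); the function name and the sample parameters are assumptions for illustration.

```python
def nested_values(alpha, eps, depth):
    """Values x_0, ..., x_depth of x_{n+1} = (1 - eps) * x_n + alpha * eps, x_0 = 0."""
    xs = [0.0]
    for _ in range(depth):
        xs.append((1 - eps) * xs[-1] + alpha * eps)
    return xs


alpha, eps = 0.8, 0.01
xs = nested_values(alpha, eps, 2000)
for n in (1, 10, 100, 1000, 2000):
    print(n, xs[n])
```

Unrolling the recurrence gives the closed form $x_n = \alpha \cdot (1 - (1 - \varepsilon)^{n})$, so the gap between the two chains approaches $\alpha$ for any fixed $\varepsilon > 0$ once the nesting depth $n$ is large compared with $1/\varepsilon$.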

3  Model  Checking  Dctl 

The model-checking problem of a Dctl formula $\varphi$ over a system with respect to one of the two semantics $* \in \{p, f\}$ consists in computing the value $\llbracket \varphi \rrbracket^{*}(s)$ for all states $s$ of the system under consideration. Similarly to Ctl model checking [4], we recursively consider one of the subformulas $\psi$ of $\varphi$ and compute the valuation $\llbracket \psi \rrbracket^{*}$. Then we replace $\psi$ in $\varphi$ by a new proposition $p_\psi$ with $\llbracket p_\psi \rrbracket = \llbracket \psi \rrbracket^{*}$. Because of the duality laws stated in Section 2.4.1, it suffices to focus on model checking formulas of the forms $\exists\Diamond_c r$, $\forall\Diamond_c r$, and $\forall\triangle_c r$, for a proposition $r \in \Sigma$. We will present the algorithms, for both semantics, over transition systems in Section 3.1, over Markov chains in Section 3.2, and over MDPs in Section 3.3.

For  complexity  analyses,  we  assume  that  operations  on  reals  (comparison, 
addition,  and  multiplication)  can  be  performed  in  constant  time:  in  other 
words,  we  provide  the  asymptotic  complexity  of  each  algorithm  in  terms  of 
the  number  of  arithmetic  operations. 


3.1  Model  Checking  Dctl  over  Transition  Systems 

We fix a finite LTS $\mathcal{S} = (S, \delta, \Sigma, \llbracket\cdot\rrbracket)$. As stated in Theorem 3, the two semantics of Dctl coincide over LTSs. Hence, only one algorithm is needed to model check a formula in either semantics.

The fixpoint semantics of $\exists\Diamond_c r$ and $\forall\Diamond_c r$ (equations (1) and (2)) suggest approximation algorithms for evaluating the corresponding formulas over LTSs by Picard iteration. For instance, $\llbracket \exists\Diamond_c r \rrbracket^{f} = \lim_{n \to \infty} v_n$, where $v_0(s) = q(s)$, and $v_{n+1}(s) = q(s) \sqcup \alpha \cdot \max\{ v_n(s') \mid s' \in \delta(s) \}$ for all $n \ge 0$. Lemmas 4 and 10 show that these algorithms reach their fixpoints within $|S|$ steps, thus yielding exact algorithms rather than approximations. As stated in [9], this leads to algorithms for model checking $\exists\Diamond_c r$ and $\forall\Diamond_c r$ in time $O(|S| \cdot |\delta|)$. Here, we provide improved algorithms of complexity $O(|\delta| + |S| \log |S|)$.

The algorithms use a priority queue data structure that provides the following functionalities.

• insert(Q, t, x): Inserts element t into the queue Q and assigns priority x to t.

• empty(Q): Returns true if the queue Q is empty, and false otherwise.

• extract-max(Q): Returns a pair (t, x), where t is an element with the highest priority in Q and x is its priority, and removes t from the queue.

• increase-key(Q, t, x): If t belongs to the queue Q and its current priority is smaller than x, increases its priority to x; otherwise, leaves Q unchanged.

By using heaps, we can implement these procedures such that computing the function empty takes time $O(1)$, while the other functions require time $O(\log n)$, where $n$ is the number of elements in the queue; see for instance [5].

3.1.1 Model checking $\exists\Diamond$. Informally, the idea behind the improved algorithm for $\exists\Diamond_c r$ is as follows. First, we set $u(s) = q(s)$ for all states $s$, and we mark the states “not-done”. Then, we iteratively pick the not-done state $s$ having the largest value of $u(s)$, and we mark it “done”; we also propagate to all its predecessors $t$ the value $u(t) := u(t) \sqcup \alpha \cdot u(s)$. We show that when a state $s$ is marked “done”, it holds that $u(s) = \llbracket \exists\Diamond_c r \rrbracket(s)$. This algorithm can be implemented using a priority queue that contains all the “not-done” states, as follows.

Algorithm 1

function ExistsDiamond(S, α, q)
  vars:
    val : state array of rationals
    Q : priority queue
  init:
    for each t ∈ S do
      insert(Q, t, q(t))
    done
  main:
    while not empty(Q) do
      (t, val[t]) := extract-max(Q)
      for each s such that t ∈ δ(s) do
        increase-key(Q, s, α · val[t])
      done
    done
    return val
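
One possible executable rendering of Algorithm 1 is sketched below in Python. The standard `heapq` module is a min-heap without an increase-key operation, so the sketch stores negated priorities and replaces increase-key by pushing a fresh entry and skipping outdated ones when they are popped (lazy deletion). This substitution, the adjacency-dictionary encoding of the transition relation, and the name `exists_diamond` are assumptions of the example, not part of the algorithm as stated.

```python
import heapq


def exists_diamond(states, delta, alpha, q):
    """Evaluate the discounted 'exists eventually' value at every state.

    states: list of states; delta: dict state -> iterable of successors;
    alpha: discount factor in [0, 1]; q: dict state -> value in [0, 1].
    """
    preds = {s: [] for s in states}
    for s in states:
        for t in set(delta[s]):
            preds[t].append(s)

    prio = dict(q)
    # Entries are (-priority, tie_breaker, state); heapq pops the smallest.
    heap = [(-prio[s], i, s) for i, s in enumerate(states)]
    heapq.heapify(heap)
    tick = len(states)

    val = {}
    while heap:
        _, _, t = heapq.heappop(heap)
        if t in val:
            continue  # outdated entry: t was already extracted
        val[t] = prio[t]
        for s in preds[t]:
            candidate = alpha * val[t]
            if s not in val and candidate > prio[s]:
                prio[s] = candidate  # plays the role of increase-key
                heapq.heappush(heap, (-candidate, tick, s))
                tick += 1
    return val


lts = {"a": ["b"], "b": ["c"], "c": ["c"]}
weights = {"a": 0.1, "b": 0.0, "c": 1.0}
values = exists_diamond(["a", "b", "c"], lts, 0.5, weights)
print(values)
```

With a binary heap the edge scan costs time proportional to the number of transitions plus a logarithmic factor per push; a heap with constant-time increase-key would be needed to match the stated bound in general.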


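To show the correctness of the algorithm, we first observe that the value $\llbracket \Diamond_c r \rrbracket(\rho)$ is attained at the first occurrence of a certain state in the path $\rho$ and that the value $\llbracket \exists\Diamond_c r \rrbracket(s)$ is attained on an acyclic, finite path.

Lemma 3 Given a path $\rho = s_0 s_1 \ldots \in Traj(s)$, there exists $k \ge 0$ such that: (i) for all $0 \le i < k$: $s_i \ne s_k$, and (ii) $\llbracket \Diamond_c r \rrbracket(\rho) = \alpha^{k} q(s_k)$.

Proof. Consider a path $\rho = s_0 s_1 \ldots$ and let $X = \{ \alpha^{i} \cdot q(s_i) \mid i \ge 0 \}$ and $u = \sup X = \llbracket \Diamond_c r \rrbracket(\rho)$. Consider two indices $0 \le j < k$ such that $s_j = s_k$. Then, $\alpha^{k} \cdot q(s_k) = \alpha^{k} \cdot q(s_j) \le \alpha^{j} \cdot q(s_j) \le u$. In words, the discounted value of $r$ at state $s_k$ is no larger than at every previous occurrence of the same state. Since $S$ is finite, the supremum therefore ranges over the finitely many first occurrences of states and is attained at one of them. □

Lemma 4 Given a state $s \in S$, there is a finite, acyclic path $s_0 s_1 \ldots s_k \in FTraj(s)$ such that:

\[ \llbracket \exists\Diamond_c r \rrbracket(s) = \alpha^{k} q(s_k). \]

Proof. From Lemma 3. □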

Lemma  5  During  the  execution  of  Algorithm  1,  it  is  always  true  that,  if  state 
s  does  not  belong  to  the  queue,  then  val[s]  is  greater  than  or  equal  to  the 
maximum  priority  in  the  queue. 

Proof.  At  the  beginning,  the  property  is  trivially  true.  If  the  property  is 
true  at  some  moment,  and  extract-max  is  called,  the  property  remains  true. 
Moreover, if the property is true at some moment, and increase-key(Q, t, ·) is called, the property also remains true. This follows from the fact that the priority of $t$ can only be increased to $\alpha \cdot x$, where $x$ is the priority of a state which has just been removed from the queue. □

The following results use the abbreviation $v(s)$ for $\llbracket \exists\Diamond_c r \rrbracket(s)$.

Lemma 6 During the execution of Algorithm 1, a state $s$ is never assigned a priority greater than $v(s)$.

Proof. The property holds after initialization because then $val[s] = q(s) \le v(s)$. Assume that the property is true and that the function increase-key is called. Then there exists a pair $(t, val[t])$ with $t \in \delta(s)$ which was just removed from the queue. Applying the definition of $v$ and the assumption yields $v(s) \ge \alpha \cdot v(t) \ge \alpha \cdot val[t] = val[s]$. All other statements trivially preserve this property. □


Lemma 7 During the execution of Algorithm 1, when a state $s$ is extracted from the queue, $val[s] = v(s)$.

Proof. For all states $s \in S$, let

\[ l_s = \min\{ |\rho| \mid \rho \text{ is a finite, acyclic path between } s \text{ and } t \text{ with } v(s) = \alpha^{|\rho|} q(t) \}. \]

Note that Lemma 4 ensures that the above minimum is never taken over an empty set. We prove our statement by induction on $l_s$.

If $l_s = 0$, we have $v(s) = q(s)$. Then, Lemma 6 guarantees that the priority of $s$ is never increased from its initial value of $q(s)$, and we obtain the result.

If $l_s > 0$, then we have $v(s) = \max_{t \in \delta(s)} \alpha v(t)$, say $v(s) = \alpha v(t_0)$. Moreover, $l_s > l_{t_0}$. Now, consider the moment where $t_0$ was extracted from the queue. By induction hypothesis, we have $val[t_0] = v(t_0)$. Lemma 6 yields that $val[s] \le v(s) = \alpha v(t_0) = \alpha\, val[t_0]$. Then, in particular, Lemma 5 implies that $s$ was still in the queue when $t_0$ was extracted. Since $t_0 \in \delta(s)$, the function increase-key is called and, since $val[s] \le \alpha v(t_0)$, this call sets $val[s]$ to its final value $\alpha v(t_0)$. □


The  following  is  a  direct  consequence  of  Lemma  7. 

Lemma 8 [Correctness] Let $val = \mathrm{ExistsDiamond}(\mathcal{S}, \alpha, q)$. For all $s \in S$, $val[s] = \llbracket \exists\Diamond_c r \rrbracket(s)$.

Lemma 9 [Complexity] Algorithm 1 runs in time $O(|\delta| + |S| \log |S|)$.

Proof. The initialization phase alone takes time $O(|S| \log |S|)$. In each iteration of the main loop, a state $t$ is extracted from the queue and increase-key is called on all predecessors of $t$. Thus, increase-key may be called several times on each state. However, the priority of a state $s$ can only be increased once. To see this, assume that at some point the priority of $s$ is increased to the value $\alpha \cdot val[t]$. It holds that all the states that are still in the queue have priority at most $val[t]$. Therefore, the priority of $s$ cannot be further increased. Considering that increase-key(·, s, ·) runs in constant time unless it actually increases the value of $s$, the complexity of the main loop reduces to examining every edge in the LTS (time $O(|\delta|)$), plus increasing the value of each state at most once (time $O(|S| \log |S|)$). □

Notice that if we want to compute $\llbracket \exists\Diamond_c r \rrbracket$ on a fixed state $s$, we can achieve a smaller complexity by exploiting the equation

\[ \llbracket \exists\Diamond_c r \rrbracket(s) = \max\{ \alpha^{sp(s,t)} \cdot q(t) \mid t \in S \}, \]

where $sp(s, t)$ is the length of an (unweighted) shortest path from $s$ to $t$. The values $sp(s, t)$ can be computed by a breadth-first search over the LTS, yielding time complexity $O(|S| + |\delta|)$ for the above algorithm.
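
The single-state shortcut can be sketched as follows in Python. The sketch runs a breadth-first search from the given state and maximizes the discounted proposition value over the reached states; states that are not reachable are ignored, which matches treating their distance as infinite when the discount factor is below 1. The function name and the adjacency-dictionary encoding are assumptions of the example.

```python
from collections import deque


def exists_diamond_at(start, delta, alpha, q):
    """Discounted 'exists eventually' value at one state via breadth-first search."""
    dist = {start: 0}
    frontier = deque([start])
    while frontier:
        u = frontier.popleft()
        for t in delta[u]:
            if t not in dist:
                dist[t] = dist[u] + 1  # first visit gives the shortest distance
                frontier.append(t)
    return max(alpha ** d * q[t] for t, d in dist.items())


lts = {"a": ["b"], "b": ["c"], "c": ["c"]}
weights = {"a": 0.1, "b": 0.0, "c": 1.0}
print(exists_diamond_at("a", lts, 0.5, weights))
```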


3.1.2 Model checking $\forall\Diamond$. The algorithm for $\forall\Diamond$ is similar to the algorithm for $\exists\Diamond$, except that the valuation of a state is increased when all of its successors are marked “done”, rather than each time a successor is marked “done”. Again, the algorithm can be implemented using a priority queue, as follows.

Algorithm 2

function ForallDiamond(S, α, q)
  vars:
    val : state array of rationals
    count : state array of integers
    Q : priority queue
  init:
    for each t ∈ S do
      count[t] := |δ(t)|
      insert(Q, t, q(t))
    done
  main:
    while not empty(Q) do
      (t, val[t]) := extract-max(Q)
      for each s such that t ∈ δ(s) do
        count[s] := count[s] − 1
        if count[s] = 0 then increase-key(Q, s, α · val[t])
      done
    done
    return val
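
A Python sketch of Algorithm 2 follows, together with a plain fixpoint iteration used only as a cross-check. As an implementation choice of this example, increase-key is emulated on top of the standard `heapq` module by pushing a new entry and discarding outdated ones on extraction; the names `forall_diamond` and `forall_diamond_iter` and the small LTS are likewise assumptions.

```python
import heapq


def forall_diamond(states, delta, alpha, q):
    """Evaluate the discounted 'for all paths, eventually' value at every state."""
    preds = {s: [] for s in states}
    for s in states:
        for t in set(delta[s]):
            preds[t].append(s)
    count = {s: len(set(delta[s])) for s in states}  # successors not yet extracted

    prio = dict(q)
    heap = [(-prio[s], i, s) for i, s in enumerate(states)]
    heapq.heapify(heap)
    tick = len(states)

    val = {}
    while heap:
        _, _, t = heapq.heappop(heap)
        if t in val:
            continue  # outdated entry
        val[t] = prio[t]
        for s in preds[t]:
            count[s] -= 1
            # Only the last successor to be extracted may raise the priority of s.
            if count[s] == 0 and s not in val and alpha * val[t] > prio[s]:
                prio[s] = alpha * val[t]
                heapq.heappush(heap, (-prio[s], tick, s))
                tick += 1
    return val


def forall_diamond_iter(states, delta, alpha, q, rounds):
    """Reference: iterate v(s) = max(q(s), alpha * min over successors of v)."""
    v = dict(q)
    for _ in range(rounds):
        v = {s: max(q[s], alpha * min(v[t] for t in delta[s])) for s in states}
    return v


nodes = ["a", "b", "c", "d"]
lts = {"a": ["b", "c"], "b": ["d"], "c": ["c"], "d": ["d"]}
weights = {"a": 0.1, "b": 0.2, "c": 0.0, "d": 1.0}
result = forall_diamond(nodes, lts, 0.5, weights)
print(result)
```

In this example state `a` can move to `c`, which loops forever with value 0, so the universal quantifier keeps the value of `a` at its own proposition value even though the branch through `b` reaches the high-valued state `d`.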


When proving the correctness of Algorithm 2, we use the short notation $v(s)$ for $\llbracket \forall\Diamond_c r \rrbracket(s)$. Moreover, we denote by $prio(s)$ the priority of $s$, when $s$ is a state belonging to the queue.

For all $s_0 \in S$, we define $STraj(s_0)$ (S stands for “simple”) to be the set of all finite paths $\rho = s_0 \ldots s_n$ such that: (i) $\rho$ is acyclic (no state repetitions), and (ii) it can be extended in one step to a cyclic path, i.e. there is $i \le n$ such that $s_i \in \delta(s_n)$. Notice that $STraj(s_0)$ is finite and every path in $STraj(s_0)$ contains at most $|S| - 1$ steps. In the statement of the following lemma, we assume that the semantics of $\Diamond_c r$ is extended to finite paths in the obvious way.

Lemma 10 In order to compute the value of $\llbracket \forall\Diamond_c r \rrbracket(s)$, only paths in $STraj(s)$ need to be considered. Formally,

\[ \llbracket \forall\Diamond_c r \rrbracket(s) = \min_{\rho \in STraj(s)} \llbracket \Diamond_c r \rrbracket(\rho). \]

Moreover, there is a path $\rho \in Traj(s)$ such that $\llbracket \forall\Diamond_c r \rrbracket(s) = \llbracket \Diamond_c r \rrbracket(\rho)$.

Proof. We first prove that $\inf_{\rho \in Traj(s)} \llbracket \Diamond_c r \rrbracket(\rho) \ge \min_{\rho \in STraj(s)} \llbracket \Diamond_c r \rrbracket(\rho)$. Let $\rho^{*} \in STraj(s)$ be any path such that $\llbracket \Diamond_c r \rrbracket(\rho^{*}) = \min_{\rho \in STraj(s)} \llbracket \Diamond_c r \rrbracket(\rho)$. If we prove that every path $\rho \in Traj(s)$ gives a value for $\Diamond_c r$ that is at least the one of $\rho^{*}$, we are done. Take any path $\rho = s_0 s_1 \ldots$ in $Traj(s)$. Let $\rho'$ be the longest prefix of $\rho$ which is an acyclic path. Clearly, $\rho' \in STraj(s)$. Since $\rho'$ is a prefix of $\rho$, it holds that $\llbracket \Diamond_c r \rrbracket(\rho) \ge \llbracket \Diamond_c r \rrbracket(\rho') \ge \llbracket \Diamond_c r \rrbracket(\rho^{*})$.

Conversely, we prove that $\inf_{\rho \in Traj(s)} \llbracket \Diamond_c r \rrbracket(\rho) \le \min_{\rho \in STraj(s)} \llbracket \Diamond_c r \rrbracket(\rho)$. We do this by showing that every element in $STraj(s)$ has a corresponding element in $Traj(s)$ which assigns the same value to $\Diamond_c r$, thus also proving the second statement of the lemma.

Let $\rho = s_0 \ldots s_n$ be an element of $STraj(s)$ and let $\rho\, s_{n+1}$ be an extension of $\rho$ which is a cyclic path. Formally, $s_{n+1} = s_j$ for some $0 \le j \le n$. Let $\rho'$ be the infinite path obtained by repeating forever the loop in $\rho\, s_{n+1}$, i.e. $\rho' = s_0 \ldots s_{j-1} (s_j \ldots s_n)^{\omega}$. By Lemma 3, the value $\llbracket \Diamond_c r \rrbracket(\rho') = \sup_{i \ge 0} \alpha^{i} \cdot \llbracket r \rrbracket(\rho'(i))$ is attained at the first occurrence of some state. By construction of $\rho'$, this state must occur in the first $n + 1$ states of $\rho'$, which are the original states of $\rho$. Therefore, $\llbracket \Diamond_c r \rrbracket(\rho') = \sup_{0 \le i \le n} \alpha^{i} \cdot \llbracket r \rrbracket(s_i) = \llbracket \Diamond_c r \rrbracket(\rho)$. □

Lemma  11  During  the  execution  of  Algorithm  2,  it  is  always  true  that,  if 
state  s  does  not  belong  to  the  queue,  then  val[s]  is  greater  than  or  equal  to  the 
maximum  priority  in  the  queue. 

Proof.  As  for  Lemma  5.  □ 

Lemma 12 During the execution of Algorithm 2, a state $s$ is never assigned a priority greater than $v(s)$.

Proof. By contradiction, let $s$ be the first state whose priority is modified to a value greater than $v(s)$, by means of a call to increase-key(Q, s, α · val[t]). Notice that this can only happen if $count[s] = 1$ when $t$ is extracted from the queue.

Since $s$ is the first such node, the priority of $t$ was never set to a value greater than $v(t)$. Thus, after the above call to increase-key, we have $v(s) < prio(s) = \alpha \cdot val[t] \le \alpha \cdot v(t)$. Considering Lemma 10, let $\rho \in Traj(s)$ be a path such that $\llbracket \Diamond_c r \rrbracket(\rho) = v(s)$. It must be that $\llbracket \Diamond_c r \rrbracket(\rho) < \alpha \cdot v(t)$. Let $\rho = (s\, s_1 s_2 \ldots)$; we claim that the state $s_1$ is still in the queue when $t$ is extracted, thus contradicting the assumption that $count[s] = 1$ when $t$ is extracted.

If $s_1$ was extracted before $t$, by Lemma 11 we get $val[s_1] \ge val[t]$. Therefore, we get from the initial assumption that $v(s) < \alpha \cdot val[t] \le \alpha \cdot val[s_1] \le \alpha \cdot v(s_1) \le \alpha \cdot \llbracket \Diamond_c r \rrbracket(s_1 s_2 \ldots)$, and at the same time $v(s) = q(s) \sqcup \alpha \cdot \llbracket \Diamond_c r \rrbracket(s_1 s_2 \ldots) \ge \alpha \cdot \llbracket \Diamond_c r \rrbracket(s_1 s_2 \ldots)$, which is a contradiction. □

Lemma 13 During the execution of Algorithm 2, when a state $s$ is extracted from the queue, $val[s] = v(s)$.

Proof. For all states $s$, let $ACF_s$ be the set of acyclic (and thus finite) paths $\rho$ starting at $s$ that satisfy:

• if $t$ is the last state of $\rho$, then $v(t) = q(t)$, and

• if $t$ is a state of $\rho$ that is not the last, then $v(t) > q(t)$.

Note that this set is non-empty. To see this, let $t^{*}$ be the state with highest $v$-value among those reachable from $s$; then, $ACF_s$ must contain a prefix of each acyclic path from $s$ to $t^{*}$. Let also $l_s = \max\{ |\rho| \mid \rho \in ACF_s \}$. We prove our statement by induction on $l_s$.

If $l_s = 0$, we have $v(s) = q(s)$. Then, Lemma 12 guarantees that the priority of $s$ is never increased from its initial value of $q(s)$, and we obtain the result.

If $l_s > 0$, we have $v(s) > q(s)$. Then, for all states $t \in \delta(s)$, we have $l_t \le l_s - 1$ and $v(t) \ge v(s)/\alpha \ge v(s)$. By inductive hypothesis, when $t$ is extracted, $val[t] = v(t)$. Now, if $s$ is extracted before $t$, $count[s]$ never reached 0 and thus the priority of $s$ was never modified from its initial value $q(s)$. Thus, Lemma 11 guarantees that, after $s$ is extracted, all elements still in the queue have priority at most $q(s)$. We then obtain that, when $t$ is finally extracted, $val[t] \le q(s) < v(s) \le v(t)$, which contradicts the inductive hypothesis. This proves that all successors of $s$ are extracted before $s$ itself. Notice that $v(s) = \alpha \cdot \min_{t \in \delta(s)} v(t)$. Then, when the last successor of $s$ leaves the queue, it assigns the correct value $v(s)$ to the priority of $s$. □


The  following  is  a  direct  consequence  of  Lemma  13. 

Lemma 14 [Correctness] Let $val = \mathrm{ForallDiamond}(\mathcal{S}, \alpha, q)$. For all $s \in S$, $val[s] = \llbracket \forall\Diamond_c r \rrbracket(s)$.

Lemma 15 [Complexity] Algorithm 2 runs in time $O(|\delta| + |S| \log |S|)$.

Proof. The initialization phase requires time $O(|S| \log |S|)$. In each iteration of the main loop, a different state is extracted from the queue and its incoming edges are considered. An optional call to increase-key is made. In total, every edge in the LTS is considered once (time $O(|\delta|)$) and increase-key is called at most once for every state (time $O(|S| \log |S|)$). □

3.1.3 Model checking $\forall\triangle$. Computing $\llbracket \forall\triangle_c r \rrbracket(s)$ consists in minimizing the (discounted) average $\llbracket \triangle_c r \rrbracket$ over the paths from $s$. As observed by [23] for the non-discounted case, the minimal discounted average is obtained on a path $\rho'$ from $s$ which, after some prefix $\rho$, keeps repeating some simple cycle $\ell$. Hence $\ell$ contains at most $|S|$ states. To find $\rho'$, we use two steps. In the first phase, we find for each state $s$ the simple cycle $\ell$ starting at $s$ with the minimal discounted average. In the second phase, we find the best prefix-cycle combination $\rho\,\ell^{\omega}$.


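Phase 1. We need to compute

\[ L_\alpha(s) = \min\{ \llbracket \triangle_c r \rrbracket(\rho) \mid \rho \in Traj(s) \text{ and } \rho = (s_0 s_1 s_2 \ldots s_{n-1})^{\omega} \text{ and } n \le |S| \}, \]

where the value $\llbracket \triangle_c r \rrbracket(\rho)$ is given by $\frac{1-\alpha}{1-\alpha^{n}} \cdot \sum_{i=0}^{n-1} \alpha^{i} q(s_i)$. Consider for $n \ge 0$ the recursion

\[ v_0(s, s') = 0, \qquad v_{n+1}(s, s') = q(s) + \alpha \cdot \min\{ v_n(t, s') \mid t \in \delta(s) \}. \]

Then $v_n(s, s')$ minimizes $\sum_{i=0}^{n-1} \alpha^{i} q(s_i)$ over all finite paths $s_0 s_1 \cdots s_n$ with $s_0 = s$ and $s_n = s'$. Hence

\[ L_\alpha(s) = (1-\alpha) \cdot \min\Bigl\{ \frac{v_1(s,s)}{1-\alpha}, \; \frac{v_2(s,s)}{1-\alpha^{2}}, \; \ldots, \; \frac{v_{|S|-1}(s,s)}{1-\alpha^{|S|-1}} \Bigr\}. \]

For a fixed state $s'$, computing $\min\{ v_n(t, s') \mid t \in \delta(s) \}$ for all $s \in S$ can be done in $O(|\delta|)$ time. Therefore, $v_{n+1}$ is obtained from $v_n$ in $O(|S|^{2} + |S| \cdot |\delta|) = O(|S| \cdot |\delta|)$ time. Hence, the computation of $v_{|S|-1}$ and $L_\alpha$ requires $O(|S|^{2} \cdot |\delta|)$ time. A possible implementation of this phase is sketched in Algorithm 3, where it holds that $L_\alpha = \mathrm{LoopCost}(\mathcal{S}, \alpha, q)$. To make the complexity of the algorithm more explicit, the transition function $\delta$ is treated as a relation $\delta \subseteq S \times S$.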


Phase 2. After a prefix of length $n$, the cost $L_\alpha(s)$ of repeating a cycle at state $s$ has to be discounted by $\alpha^n$, which is exactly the factor by which we discount $q(s)$ after taking that prefix. Hence, we modify the original LTS $\mathcal{S}$ into an LTS $\mathcal{S}^+$, as follows. For every state $s \in S$, we add a copy $\hat{s}$ whose weight $w^+(\hat{s})$ we set to $L_\alpha(s)$; the weights $w^+(s)$ of states $s \in S$ remain $q(s)$. Moreover, for every $t \in S$ and $s \in \delta(t)$, we add $\hat{s}$ as a successor to $t$, that is, $\delta^+(t) = \delta(t) \cup \{\hat{s} \mid s \in \delta(t)\}$ and $\delta^+(\hat{s}) = \{\hat{s}\}$. Taking the transition from $t$ to $\hat{s}$ corresponds to moving to $s$ and repeating the optimal cycle from there. We find the value of the optimal prefix-cycle combination starting from $s$ as the discounted distance from $s$ to $\hat{S} = \{\hat{s} \mid s \in S\}$ in the modified graph $\mathcal{S}^+$ with weights $w^+$. Formally, given an LTS $\mathcal{S}$, a state $s$, a weight function $w : S \to \mathbb{R}^{\ge 0}$, a discount factor $\alpha$, and a target set $T$, the minimal discounted distance from $s$ to $T$ is $d(s) = \min\{ \sum_{i=0}^{n-1} \alpha^i \cdot w(s_i) \mid s_0 s_1 \ldots s_{n-1} \in \mathit{FTraj}(s) \text{ and } s_{n-1} \in T \}$. The value of $d(s)$ for $s \in S$ is computed by the call $\text{DiscountedDistance}(\mathcal{S}^+, w^+, \alpha, \hat{S})$ to Algorithm 4, which is a discounted version of the Bellman-Ford algorithm for finding shortest paths. Our algorithm performs backward computation from the set $T$, because discounted shortest paths (i.e., paths whose discounted distance is minimal among all paths with the same first and last state) are closed under suffixes, but not under prefixes.

Algorithm 3

function LoopCost($\mathcal{S}$, $\alpha$, $q$)
vars:
    $v_i$, for $i \in \{0, \ldots, |S| - 1\}$ : (state $\times$ state) array of rationals
    $L_\alpha$ : state array of rationals
init:
    for each $s, t \in S$ do
        $v_0[s, t] := 0$
    done
    for each $s, t \in S$ and $i \in \{1, \ldots, |S| - 1\}$ do
        $v_i[s, t] := \infty$
    done
main:
    for $i := 1$ to $|S| - 1$ do
        for each edge $(s, t) \in \delta$ do
            for each $s' \in S$ do
                if $v_{i-1}[t, s'] < v_i[s, s']$ then $v_i[s, s'] := v_{i-1}[t, s']$
            done
        done
        for each $s, s' \in S$ do
            $v_i[s, s'] := q(s) + \alpha \cdot v_i[s, s']$
        done
    done
    for each $s \in S$ do
        $L_\alpha[s] := (1 - \alpha) \cdot \min\{ v_1[s,s] / (1 - \alpha), \ldots, v_{|S|-1}[s,s] / (1 - \alpha^{|S|-1}) \}$
    done
    return $L_\alpha$

Algorithm 4

function DiscountedDistance($\mathcal{S}$, $w$, $\alpha$, $T$)
vars:
    $d$ : state array of rationals
init:
    for each $t \in S$ do
        if $t \in T$ then $d[t] := w(t)$ else $d[t] := \infty$
    done
main:
    for $i := 1$ to $|S| - 1$ do
        for each $s \in S$ and $s' \in \delta(s)$ do
            if $d[s] > w(s) + \alpha \cdot d[s']$ then $d[s] := w(s) + \alpha \cdot d[s']$
        done
    done
    return $d$
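
Algorithm 4 translates almost line by line into Python. The following is a minimal sketch under stated assumptions: the graph is given as a dictionary `delta` of successor lists, weights as a dictionary `w`, and $0 < \alpha < 1$; the names `discounted_distance`, `delta`, and `targets` are hypothetical. Floating-point numbers stand in for the rationals of the pseudocode.

```python
import math

def discounted_distance(states, delta, w, alpha, targets):
    """Backward, discounted Bellman-Ford relaxation from the target set."""
    # Target states start at their own weight, all others at infinity.
    d = {t: (w[t] if t in targets else math.inf) for t in states}
    for _ in range(len(states) - 1):
        for s in states:
            for t in delta[s]:
                # Prepending s to the best known path from t costs w(s) + alpha * d[t].
                candidate = w[s] + alpha * d[t]
                if d[s] > candidate:
                    d[s] = candidate
    return d


# a -> b, a -> c, b -> c, with target set {c}.
states = ["a", "b", "c"]
delta = {"a": ["b", "c"], "b": ["c"], "c": []}
w = {"a": 1.0, "b": 0.0, "c": 4.0}
dist = discounted_distance(states, delta, w, alpha=0.5, targets={"c"})
```

Here the direct edge from `a` to `c` costs $1 + 0.5 \cdot 4 = 3$, whereas the detour through `b` costs $1 + 0.5 \cdot (0 + 0.5 \cdot 4) = 2$: under discounting, a longer path can be cheaper because the expensive target is reached later.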

Like the standard version, discounted Bellman-Ford runs in $O(|S| \cdot |\delta|)$ time. Thus, the complexity of computing $[\![\forall\triangle_c r]\!]$ is dominated by the first phase.

Lemma 16 [Correctness] Let $d = \text{DiscountedDistance}(\mathcal{S}^+, w^+, \alpha, \hat{S})$, where $\mathcal{S}^+$, $w^+$ and $\hat{S}$ are defined in the previous section. For all $s \in S$, $d[s] = [\![\forall\triangle_c r]\!](s)$.

Lemma 17 [Complexity] The value $[\![\forall\triangle_c r]\!]$ can be computed in time $O(|S|^2 \cdot |\delta|)$.

3.1.4  Complexity  of  Dctl  model  checking  over  LTSs.  The  overall 
complexity  of  model  checking  a  Dctl  formula  is  polynomial  in  the  size  of  the 
system  and  the  size  of  the  formula. 

Theorem 5 Consider a Dctl formula $\varphi$, an LTS $\mathcal{S} = (S, \delta, P, [\![\cdot]\!])$, and a parameter interpretation $\langle\cdot\rangle$. The following assertions hold:

(1) The problem of model checking $\varphi$ over $\mathcal{S}$ with respect to $\langle\cdot\rangle$ can be solved in time $O(|S|^2 \cdot |\delta| \cdot |\varphi|)$.

(2) If $\varphi$ does not contain the $\triangle$ operator, then the problem of model checking $\varphi$ over $\mathcal{S}$ with respect to $\langle\cdot\rangle$ can be solved in time $O((|\delta| + |S| \log |S|) \cdot |\varphi|)$.

3.2  Model  Checking  Dctl  over  Markov  Chains 

As stated by Theorem 2, the path and fixpoint semantics over Markov chains coincide for the formula $\exists\triangle_c r$. Hence, in Section 3.2.4 we present an algorithm for model checking this formula over Markov chains in either semantics. By contrast, the path and the fixpoint semantics over Markov chains may differ for the formulas $\exists\Diamond_c r$ and $\forall\Diamond_c r$. Hence, we need to provide algorithms for both semantics. Because of the absence of nondeterministic choice in a Markov chain, $[\![\exists\Diamond_c r]\!]^* = [\![\forall\Diamond_c r]\!]^*$ for $* \in \{f, p\}$; so giving algorithms for $\exists\Diamond_c$ suffices. Section 3.2.1 gives the algorithm for model checking $\exists\Diamond_c r$ over a Markov chain with respect to the path semantics; Section 3.2.3 treats the formula $\exists\Diamond_c r$ in the fixpoint semantics. In the following, we consider a fixed Markov chain $(S, \tau, \Sigma, [\![\cdot]\!])$ and its probability transition matrix $P$. We write $I$ for the identity matrix.

3.2.1 Model checking $\exists\Diamond$ in the path semantics. When evaluating $[\![\exists\Diamond_c r]\!]^p$ in a state $s$, we start with the initial estimate $q(s)$. If $s$ is the state $s_{\max}$ with the maximum value of $q$, the initial estimate is the correct value. If $s$ has the second largest value for $q$, the estimate can only be improved if $s_{\max}$ is hit within a certain number $l$ of steps, namely, before the discount $\alpha^l$ becomes smaller than $q(s)/q(s_{\max})$. This argument is recursively applied to all states.

Let $s_1, \ldots, s_n$ be an ordering of the states in $S$ such that $q(s_1) \ge q(s_2) \ge \cdots \ge q(s_n)$. We use integers as matrix indices, thus writing $P(i,j)$ for $p_{s_i, s_j}$. For all $1 \le j < i \le n$, let

$$k_{i,j} = \begin{cases} \left\lfloor \log_\alpha \frac{q(s_i)}{q(s_j)} \right\rfloor & \text{if } q(s_i) > 0 \\ 0 & \text{if } q(s_i) = q(s_j) = 0 \\ \infty & \text{otherwise.} \end{cases}$$

Let $v(s_i) = [\![\exists\Diamond_c r]\!]^p(s_i)$. Then, $v(s_1) = q(s_1)$, and we can express the value of $v(s_i)$ in terms of the values $v(s_1), \ldots, v(s_{i-1})$. Let $K = \max\{ k_{i,j} \mid k_{i,j} < \infty \}$, and for all $l > 0$, let $B^i_l = \{ s_j \mid 1 \le j < i \text{ and } 1 \le l \le k_{i,j} \}$. Intuitively, $B^i_l$ contains those states that, if hit in exactly $l$ steps from $s_i$, can increase the value of $v(s_i)$.
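
To make the role of $k_{i,j}$ concrete, the following Python sketch computes it with exact rational arithmetic instead of a floating-point logarithm. It is an illustration only: the function name `hit_window` is hypothetical, and the sketch assumes $0 < \alpha < 1$ and $q(s_j) \ge q(s_i) \ge 0$, as guaranteed by the ordering of the states. It uses the characterization that $k_{i,j}$ is the largest $l$ with $\alpha^l \cdot q(s_j) \ge q(s_i)$, which is equivalent to the floor of the logarithm when $q(s_i) > 0$.

```python
import math
from fractions import Fraction

def hit_window(q_i, q_j, alpha):
    """Largest l >= 0 such that alpha**l * q_j >= q_i.

    Assumes q_j >= q_i >= 0 and 0 < alpha < 1 (exact Fractions recommended).
    """
    if q_i == 0:
        # Nothing to gain if q_j is also 0; otherwise any hitting time helps.
        return 0 if q_j == 0 else math.inf
    l = 0
    while alpha ** (l + 1) * q_j >= q_i:
        l += 1
    return l


half = Fraction(1, 2)
k_quarter = hit_window(Fraction(1, 4), Fraction(1), half)
k_three_tenths = hit_window(Fraction(3, 10), Fraction(1), half)
k_equal = hit_window(Fraction(1, 2), Fraction(1, 2), half)
```

With $\alpha = 1/2$ and $q(s_j) = 1$, a state with $q(s_i) = 1/4$ can still profit from hitting $s_j$ after two steps, while a state with $q(s_i) = 3/10$ profits only if $s_j$ is hit in the first step.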

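For the (arbitrary) state $s_i$, the following holds:

$$v(s_i) = q(s_i) \cdot \mathit{stay}^i + \sum_{j=1}^{i-1} v(s_j) \cdot \sum_{l=1}^{k_{i,j}} \alpha^l \cdot \mathit{go}^i_{j,l} \qquad (9)$$

where $\mathit{stay}^i = \Pr_{s_i}\bigl[\bigwedge_{l>0} Z_l \notin B^i_l\bigr]$, $\mathit{go}^i_{j,l} = \Pr_{s_i}\bigl[Z_l = s_j \wedge \bigwedge_{m=1}^{l-1} Z_m \notin B^i_m\bigr]$, and the random variable $Z_l$ was defined in Section 2.3 as the state of the Markov chain after $l$ steps. It is easy to check that $\mathit{stay}^i + \sum_{j=1}^{i-1} \sum_{l=1}^{k_{i,j}} \mathit{go}^i_{j,l} = 1$. We proceed in two phases. The first phase handles states $s_i$ with $q(s_i) > 0$. Since the sequence $(B^i_l)_{l>0}$ is decreasing, it can have at most $|S|$ different values. It follows that there exist $m \le |S|$ and $b^i_1 < \cdots < b^i_{m+1} \in \mathbb{N}$ and sets $X^i_1, \ldots, X^i_m \subseteq S$ such that $b^i_1 = 1$, $b^i_{m+1} = K + 1$, and for all $k = 1, \ldots, m$ and all $b^i_k \le l < b^i_{k+1}$, we have $B^i_l = X^i_k$. Let $P^i_k$ be the substochastic matrix obtained from $P$ by disabling all transitions leading to states in $X^i_k$, i.e., $P^i_k(j', j) = 0$ for all $j', j$ with $s_j \in X^i_k$. Then, for given $b^i_k \le l < b^i_{k+1}$, we have

$$\mathit{go}^i_{j,l} = \Bigl( (P^i_1)^{b^i_2 - b^i_1} \cdot (P^i_2)^{b^i_3 - b^i_2} \cdot \ldots \cdot (P^i_{k-1})^{b^i_k - b^i_{k-1}} \cdot (P^i_k)^{l - b^i_k} \cdot P \Bigr)(i, j).$$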
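Let $m^i_j = \max\{ k \mid s_j \in X^i_k \}$ be the index of the last $X^i_k$ containing $s_j$. We have

$$\begin{aligned}
\sum_{l=1}^{k_{i,j}} \alpha^l \cdot \mathit{go}^i_{j,l}
&= \sum_{k=1}^{m^i_j} \; \sum_{l=b^i_k}^{b^i_{k+1}-1} \alpha^l \cdot \mathit{go}^i_{j,l} \\
&= \sum_{k=1}^{m^i_j} \Bigl( \alpha^{b^i_k} \cdot (P^i_1)^{b^i_2 - b^i_1} \cdot \ldots \cdot (P^i_{k-1})^{b^i_k - b^i_{k-1}} \cdot \Bigl( \sum_{l=0}^{b^i_{k+1} - b^i_k - 1} \alpha^l \cdot (P^i_k)^l \Bigr) \cdot P \Bigr)(i, j) \\
&= \sum_{k=1}^{m^i_j} \Bigl( \alpha^{b^i_k} \cdot (P^i_1)^{b^i_2 - b^i_1} \cdot \ldots \cdot (P^i_{k-1})^{b^i_k - b^i_{k-1}} \cdot \frac{I - (\alpha P^i_k)^{b^i_{k+1} - b^i_k}}{I - \alpha P^i_k} \cdot P \Bigr)(i, j).
\end{aligned}$$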

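Each matrix $(P^i_k)^{b^i_{k+1} - b^i_k}$ can be computed by repeated squaring in time $O(|S|^3 \cdot \log b^i_{k+1})$. Some further calculations show that, for a fixed $i$, both $\sum_{l=1}^{k_{i,j}} \alpha^l \cdot \mathit{go}^i_{j,l}$ and $\sum_{l=1}^{k_{i,j}} \mathit{go}^i_{j,l}$ can be computed in time $O(|S|^4 \cdot \log K)$. The value $\mathit{stay}^i$ is given by $1 - \sum_{j,l} \mathit{go}^i_{j,l}$. The total complexity of this phase is thus $O(|S|^5 \cdot \log K)$.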

The second phase considers those states $s_i$ with $q(s_i) = 0$. Let $u$ be the smallest index $i$ such that $q(s_i) = 0$. Now, $\mathit{go}^i_{j,l}$ is the probability of hitting $s_j$ after exactly $l$ steps, meanwhile avoiding all states with indices smaller than $u$. To compute $v(s_i)$ efficiently, we define a stochastic matrix $P_0$ from $P$ by adding an absorbing state $s_{n+1}$ and using $s_{n+1}$ to turn all states $s_j$ with $j < u$ into transient states (so, for all $j < u$, $P_0(j, n+1) = 1$ and $P_0(j, j') = 0$ for $j' \ne n+1$). Also, we set $v$ to be the column vector with $v_j = v(s_j)$ (computed in phase 1), if $j < u$, and $v_j = 0$ otherwise. Then,

$$v(s_i) = \sum_{j=1}^{u-1} v(s_j) \cdot \sum_{l=1}^{\infty} \alpha^l \cdot (P_0)^l(i, j) = \bigl( (I - \alpha P_0)^{-1} \cdot v \bigr)(i). \qquad (10)$$

Solving the system (10) takes time $O(|S|^3)$ using LUP decomposition. The time spent in the two phases amounts to $O(|S|^5 \cdot \log K)$, which is polynomial in the size of the input.

Lemma 18 The value $[\![\exists\Diamond_c r]\!]^p$ can be computed in time $O(|S|^5 \cdot \log K)$.


3.2.2 Alternative algorithm for $\exists\Diamond$ in the path semantics. We can solve the system (9) using an alternative recursion. We obtain an algorithm that takes time $O(|S|^3 \cdot K)$. As $K$ can be exponential in the number of bits used to encode the numerical constants in the system, this algorithm is only pseudo-polynomial. However, in principle this algorithm performs better than the previous one when $K < |S|^2 \cdot \log K$. We outline this solution below.

The main step in the algorithm presented in the previous section is to compute the values $\mathit{go}^i_{j,l}$ for states $s_i$ for which $q(s_i) > 0$. For $l > 0$, let $C^i_l$ be the event "$Z_l \notin B^i_l$". It holds that

$$\begin{aligned}
\mathit{go}^i_{j,l} &= \Pr_{s_i}[Z_l = s_j \wedge C^i_1 \wedge \ldots \wedge C^i_{l-1}] \\
&= \Pr_{s_i}[Z_l = s_j \mid C^i_1 \wedge \ldots \wedge C^i_{l-1}] \cdot \Pr_{s_i}[C^i_{l-1} \mid C^i_1 \wedge \ldots \wedge C^i_{l-2}] \cdot \ldots \cdot \Pr_{s_i}[C^i_1].
\end{aligned}$$

For each $j = 1, \ldots, i-1$ and $l > 0$, let $p(s_j, l) = \Pr_{s_i}\bigl[Z_l = s_j \mid \bigwedge_{m=1}^{l-1} Z_m \notin B^i_m\bigr]$.

In words, $p(s_j, l)$ is the probability that, starting in $s_i$, the system reaches $s_j$ after exactly $l$ steps, given that in each previous step it does not hit states that can influence $v(s_i)$. For all $j = 1, \ldots, n$ and $0 < l \le K$, we can compute $\Pr_{s_i}[Z_l = s_j \mid C^i_1 \wedge \ldots \wedge C^i_{l-1}]$ together with $p(s_j, l)$ using the following recursion:

$$\begin{aligned}
p(s_j, 1) &= P(i, j) \\
\Pr_{s_i}[C^i_1] &= \sum \{\, p(s_t, 1) \mid s_t \notin B^i_1 \,\} \\
p(s_j, l+1) &= \frac{1}{\Pr_{s_i}[C^i_l \mid C^i_1 \wedge \ldots \wedge C^i_{l-1}]} \cdot \sum \{\, p(s_t, l) \cdot P(t, j) \mid s_t \notin B^i_l \,\} \\
\Pr_{s_i}[C^i_{l+1} \mid C^i_1 \wedge \ldots \wedge C^i_l] &= \sum \{\, p(s_t, l+1) \mid s_t \notin B^i_{l+1} \,\}
\end{aligned}$$
For a fixed $i$, the previous recursion takes time $O(|S|^2 \cdot K)$. Then,

$$\mathit{go}^i_{j,l} = p(s_j, l) \cdot \prod_{m=1}^{l-1} \Pr_{s_i}[C^i_m \mid C^i_1 \wedge \ldots \wedge C^i_{m-1}].$$

It follows that, for a fixed $i$, all values $\mathit{go}^i_{j,l}$ can be computed in time $O(|S|^2 \cdot K)$. The total complexity is thus $O(|S|^3 \cdot K)$. For states $s_i$ such that $q(s_i) = 0$, we again solve the system (10) using LUP decomposition. Overall, this gives an algorithm that runs in $O(|S|^3 \cdot K)$.

Lemma 19 The value $[\![\exists\Diamond_c r]\!]^p$ can be computed in time $O(|S|^3 \cdot K)$.

3.2.3 Model checking $\exists\Diamond$ in the fixpoint semantics. The value $[\![\exists\Diamond_c r]\!]^f$ on a MC can be computed by transforming the fixpoint (1) into a linear-programming problem, following a standard approach. Expanding the definition of (1) for MCs, we have that $[\![\exists\Diamond_c r]\!]^f$ is the unique fixpoint of the following equation in $v : S \to \mathbb{R}$: for all $s \in S$,

$$v(s) = q(s) \sqcup \alpha \cdot \sum_{t \in S} v(t) \cdot p_{s,t}. \qquad (11)$$

The following lemma enables us to compute this fixpoint via linear programming.

Lemma 20 Consider the following linear-programming problem in the set $\{ v(s) \mid s \in S \}$ of variables: minimize $\sum_{s \in S} v(s)$ subject to

$$v(s) \ge q(s), \qquad v(s) \ge \alpha \cdot \sum_{t \in S} v(t) \cdot p_{s,t}$$

for all $s \in S$. If $v^* \in V_S$ is an optimal solution, then $v^* = [\![\exists\Diamond_c r]\!]^f$.

The above linear programming problem can be solved in time polynomial in $|\mathcal{S}|_b$ and $|\alpha|_b$.
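
For small examples, the fixpoint of equation (11) can also be approximated by plain iteration instead of linear programming. The sketch below is such a substitute technique, not the linear program of Lemma 20: it repeatedly applies the right-hand side of (11), which converges when $\alpha < 1$ because the map is then a contraction. The function name `fixpoint_eventually`, the list-of-rows encoding of the matrix, and the tolerance `tol` are assumptions of the sketch.

```python
def fixpoint_eventually(q, P, alpha, tol=1e-12, max_iter=100000):
    """Iterate v(s) := max(q(s), alpha * sum_t P[s][t] * v(t)) until stable.

    q is a list of state values, P a row-stochastic matrix as a list of rows.
    Assumes 0 <= alpha < 1; returns an approximation of the fixpoint of (11).
    """
    n = len(q)
    v = list(q)  # start from q, which lies below the fixpoint
    for _ in range(max_iter):
        new = [
            max(q[s], alpha * sum(P[s][t] * v[t] for t in range(n)))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, v)) <= tol:
            return new
        v = new
    return v


# State 0 (q = 0.2) moves to state 1 (q = 1), which loops forever.
values = fixpoint_eventually(q=[0.2, 1.0], P=[[0.0, 1.0], [0.0, 1.0]], alpha=0.5)
```

In the example, state 1 keeps its own value $1$, and state 0 obtains $\max(0.2,\; 0.5 \cdot 1) = 0.5$.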

3.2.4 Model checking $\forall\triangle$ in both semantics. Formulas of the type $\forall\triangle_c r$ can be evaluated by the following classical equation [12].

Lemma 21 Let $[\![\exists\triangle_c r]\!]$ and $q$ denote column vectors, we have

$$[\![\exists\triangle_c r]\!] = (1 - \alpha) \cdot \sum_{i \ge 0} \alpha^i P^i q = (1 - \alpha) \cdot (I - \alpha P)^{-1} \cdot q.$$

Thus, we can compute the value $[\![\exists\triangle_c r]\!](s)$ for each state $s \in S$ by solving a linear system with $|S|$ variables. This takes time $O(|S|^{\log_2 7})$ using Strassen's algorithm or $O(|S|^3)$ using LUP decomposition.
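
The closed form of Lemma 21 can be checked on small chains with a few lines of Python. The sketch below solves $(I - \alpha P)\,x = q$ by Gauss-Jordan elimination over exact rationals and scales the result by $1 - \alpha$. It is an illustration under stated assumptions: the helper names `solve` and `discounted_average` are hypothetical, plain elimination stands in for the LUP decomposition mentioned in the text, and $\alpha < 1$ is assumed so that $I - \alpha P$ is invertible.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact rationals."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        # A nonzero pivot exists because A is assumed to be invertible.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def discounted_average(q, P, alpha):
    """Return (1 - alpha) * (I - alpha * P)^(-1) * q as a list of Fractions."""
    n = len(q)
    A = [[(1 if i == j else 0) - alpha * P[i][j] for j in range(n)] for i in range(n)]
    return [(1 - alpha) * x for x in solve(A, q)]


half = Fraction(1, 2)
# Chain 1: state 0 (q = 1) moves to state 1 (q = 0), which loops forever.
chain1 = discounted_average([Fraction(1), Fraction(0)], [[0, 1], [0, 1]], half)
# Chain 2: both states move to either state with probability 1/2.
chain2 = discounted_average([Fraction(1), Fraction(0)], [[half, half], [half, half]], half)
```

In the first chain only the initial state contributes, giving $(1 - \alpha) \cdot 1 = 1/2$ from state 0. In the second chain the values are $3/4$ and $1/4$, which can be verified directly against $v(s) = (1-\alpha)\,q(s) + \alpha \sum_t p_{s,t}\, v(t)$.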

3.2.5 Complexity of Dctl model checking over Markov chains. The overall complexity is polynomial in the size of the system, and exponential in the size of the formula. The latter exponential complexity is due to the fact that the number of arithmetic operations is polynomial in the size of the bitwise encoding of the valuations, and these encodings grow exponentially with respect to the number of nestings of temporal operators.

Theorem 6 Given a Dctl formula $\varphi$, a Markov chain $\mathcal{S} = (S, \tau, P, [\![\cdot]\!])$, and a parameter interpretation $\langle\cdot\rangle$, the problem of model checking $\varphi$ over $\mathcal{S}$ with respect to $\langle\cdot\rangle$ can be solved in time polynomial in $|S|$, $|[\![\cdot]\!]|_b$, and $|\langle\cdot\rangle|_b$, and exponential in $|\varphi|$.


3.3  Model  Checking  Dctl  over  Markov  Decision  Processes 

As is the case for Markov chains, for MDPs the path and fixpoint semantics do not coincide for the formulas $\exists\Diamond_c r$ and $\forall\Diamond_c r$, so that separate algorithms are needed. The two semantics do coincide for the formula $\forall\triangle_c r$ on MDPs, hence one algorithm suffices. We consider a fixed MDP $\mathcal{S} = (S, \tau, \Sigma, [\![\cdot]\!])$.

3.3.1 Model checking $\exists\Diamond$ and $\forall\Diamond$ in the path semantics. If $\alpha = 0$, then trivially $[\![\exists\Diamond_c r]\!]^p(s) = [\![\forall\Diamond_c r]\!]^p(s) = q(s)$ at all $s \in S$, so in the following we assume $0 < \alpha < 1$. The problem of computing $[\![\exists\Diamond_c r]\!]^p$ on an MDP can be viewed as an optimization problem, where the goal is to maximize the expected value of the sup of $q$ over a path. As a preliminary step to solve the problem, we note that in general the optimal strategy is history dependent, that is, the choice of distribution at a state depends in general on the past sequence of states visited by the path.

Fig. 3. An MDP requiring a memory strategy for $[\![\exists\Diamond_c r]\!]^p(s)$. (Figure not reproduced; the surviving state labels are $q = 1$, $q = 0$, and $q = 0.8$.)

Example 2 Consider the system depicted in Figure 3 and assume $\alpha = 1$. The optimal choice in state $s_2$ depends on whether $t_1$ was hit or not. If it was, the current sup is $0.8$ and the best choice is $a_1$, because with probability $\frac{1}{2}$ the sup will increase to $1$. If $t_1$ was not hit, the best choice is $a_2$, because it gives a certain gain of $0.8$, rather than an expected gain of $0.5$. The same argument holds if $\alpha$ is sufficiently close to $1$.

While the above example indicates that the optimal strategy is in general history-dependent, it also suggests that all a strategy needs to remember is the maximum value that has occurred so far along the path. For $s \in S$ and $x \in \mathbb{R}$, we define


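$$h_\exists(s, x) = \sup_{\pi \in \Pi} E^\pi_s \Bigl[ x \sqcup \sup_{i \ge 0} \alpha^i q(Z_i) \Bigr]$$

$$h_\forall(s, x) = \inf_{\pi \in \Pi} E^\pi_s \Bigl[ x \sqcup \sup_{i \ge 0} \alpha^i q(Z_i) \Bigr].$$

Obviously, we have $[\![\exists\Diamond_c r]\!]^p(s) = h_\exists(s, 0)$ and $[\![\forall\Diamond_c r]\!]^p(s) = h_\forall(s, 0)$. Note that the type of $h_\exists$ and $h_\forall$ is $S \times \mathbb{R} \to \mathbb{R}$. To compute these quantities, we define two operators $H_\exists, H_\forall : (S \times \mathbb{R} \to \mathbb{R}) \to (S \times \mathbb{R} \to \mathbb{R})$ as follows, for all $v : S \times \mathbb{R} \to \mathbb{R}$, $s \in S$, and $x \in \mathbb{R}$:

$$H_\exists(v)(s, x) = \begin{cases} x & \text{if } x \ge 1 \\ \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} v\bigl(t, \frac{x \sqcup q(s)}{\alpha}\bigr) \cdot a(t) & \text{otherwise} \end{cases} \qquad (12)$$

$$H_\forall(v)(s, x) = \begin{cases} x & \text{if } x \ge 1 \\ \alpha \cdot \min_{a \in \tau(s)} \sum_{t \in S} v\bigl(t, \frac{x \sqcup q(s)}{\alpha}\bigr) \cdot a(t) & \text{otherwise} \end{cases} \qquad (13)$$

Intuitively, the equation (12) can be understood as follows. At a state $s = s_m$ of a path $s_0 s_1 \ldots$, the quantity $v(s_m, x)$ represents the maximum over all strategies of $E_{s_m}[\sup_{i \ge 0} \alpha^i q(Z_i)]$ given that $\max_{0 < i \le m} \alpha^{-i} q(s_{m-i}) = x$. The recursion (12) then relates $v(s, x)$ to $v(t, y)$ at the successors $t$ of $s$, where at $t$ we consider the new conditioning $y = (x \sqcup q(s))/\alpha$, thus discounting $x \sqcup q(s)$ by $\alpha^{-1}$ (as $s$ is one step before $t$). The following lemma states that $h_\exists$ and $h_\forall$ are the unique fixpoints of $H_\exists$ and $H_\forall$, respectively.

Lemma 22 $h_\exists$ and $h_\forall$ are the unique fixpoints of $H_\exists$ and $H_\forall$.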

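Proof. It is easy to see that the operators $H_\exists$ and $H_\forall$ admit a unique fixpoint, as they are $\alpha$-contractions. We show that $h_\exists$ is a fixpoint of $H_\exists$; the case for $h_\forall$ and $H_\forall$ is analogous. We thus show that $H_\exists(h_\exists)(s, x) = h_\exists(s, x)$, for all $s \in S$ and $x \in \mathbb{R}$. First, note that for $x \ge 1$ we have $h_\exists(s, x) = x$, as the expectation of $\sup_{i \ge 0} \alpha^i q(Z_i)$ can be no larger than $1$. For $x < 1$, we have:

$$\begin{aligned}
H_\exists(h_\exists)(s, x) &= \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t) \cdot h_\exists\Bigl(t, \frac{x \sqcup q(s)}{\alpha}\Bigr) \\
&= \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} a(t) \cdot \sup_{\pi \in \Pi} E^\pi_t \Bigl[ \frac{x \sqcup q(s)}{\alpha} \sqcup \sup_{i \ge 0} \alpha^i q(Z_i) \Bigr] \\
&= \max_{a \in \tau(s)} \sum_{t \in S} a(t) \cdot \sup_{\pi \in \Pi} E^\pi_t \Bigl[ x \sqcup q(s) \sqcup \sup_{i \ge 0} \alpha^{i+1} q(Z_i) \Bigr] \\
&= \sup_{\pi \in \Pi} E^\pi_s \Bigl[ x \sqcup \sup_{i \ge 0} \alpha^i q(Z_i) \Bigr] = h_\exists(s, x).
\end{aligned}$$

□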

Since we are ultimately interested in the value of $h_\exists(s, 0)$ for $s \in S$, and since if $x \ge 1$ we have $E^\pi_s[x \sqcup \sup_{i \ge 0} \alpha^i q(Z_i)] = x$ for all $s \in S$ and $\pi \in \Pi$, it suffices to consider values for $x$ that belong to the finite set

$$X = \{\, q(s)/\alpha^k \mid s \in S \wedge k \in \mathbb{N} \wedge q(s)/\alpha^k < 1 \,\}.$$

To estimate the cardinality of $X$, consider any state $s$: if $q(s) \in \{0\} \cup [\alpha, 1)$, then $s$ has only one representative in $X$, namely $q(s)$. If $q(s) = 1$ then $s$ has no representative at all in $X$. Finally, if $q(s) \in (0, \alpha)$, $s$ has $k_s$ representatives $q(s), q(s)/\alpha, q(s)/\alpha^2, \ldots, q(s)/\alpha^{k_s - 1}$, where $k_s = \lceil \log_\alpha q(s) \rceil$. Thus, let $Y = \{ q(s) \mid s \in S \wedge q(s) \in (0, \alpha) \}$; if $Y = \emptyset$, $|X| \le |S|$; otherwise, $|X| \le |S| \cdot \lceil \log_\alpha(\min Y) \rceil$.
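
The construction of $X$ is easy to reproduce on concrete numbers. The following Python sketch enumerates the representatives of each state with exact rationals; the function names `representatives` and `build_X` and the dictionary encoding of $q$ are hypothetical, and the sketch assumes $0 < \alpha < 1$ and $0 \le q(s) \le 1$.

```python
from fractions import Fraction

def representatives(q_value, alpha):
    """Values q/alpha**k for k = 0, 1, ... that stay strictly below 1."""
    if q_value == 0:
        return [q_value]  # dividing 0 by alpha never changes it
    values = []
    x = q_value
    while x < 1:
        values.append(x)
        x = x / alpha
    return values

def build_X(q, alpha):
    """Sorted list of all representatives over the states of q."""
    return sorted({x for value in q.values() for x in representatives(value, alpha)})


half = Fraction(1, 2)
q = {"s": Fraction(1, 10), "t": Fraction(3, 4), "u": Fraction(1), "z": Fraction(0)}
X = build_X(q, half)
```

With $\alpha = 1/2$, the state with $q(s) = 1/10$ contributes the four values $1/10, 1/5, 2/5, 4/5$, in agreement with $k_s = \lceil \log_{1/2} (1/10) \rceil = 4$; the state with value $3/4 \in [\alpha, 1)$ contributes only itself, and the state with value $1$ contributes nothing.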

The fixpoints of $H_\exists$ and $H_\forall$ can be computed via linear programming, following a standard approach, enabling us to compute the path semantics of $\exists\Diamond$ and $\forall\Diamond$ in MDPs.

Lemma 23 The following assertions hold:

(1) Consider the following linear program in the set $\{ v(s, x) \mid s \in S \wedge x \in X \}$ of variables: minimize $\sum_{s \in S} \sum_{x \in X} v(s, x)$ subject to

$$v(s, x) \ge \alpha \cdot \sum_{t \in S} \tilde{v}\Bigl(t, \frac{x \sqcup q(s)}{\alpha}\Bigr) \cdot a(t)$$

for all $s \in S$, all $x \in X$, and all $a \in \tau(s)$, where $\tilde{v}(t, x)$ is $x$ if $x \ge 1$, and is $v(t, x)$ otherwise. Denoting by $\{ v^*(s, x) \mid s \in S \wedge x \in X \}$ an optimal solution, we have $[\![\exists\Diamond_c r]\!]^p(s) = v^*(s, q(s))$ for all $s \in S$.

(2) Consider the following linear program in the set $\{ v(s, x) \mid s \in S \wedge x \in X \}$ of variables: maximize $\sum_{s \in S} \sum_{x \in X} v(s, x)$ subject to

$$v(s, x) \le \alpha \cdot \sum_{t \in S} \tilde{v}\Bigl(t, \frac{x \sqcup q(s)}{\alpha}\Bigr) \cdot a(t)$$

for all $s \in S$, all $x \in X$, and all $a \in \tau(s)$, where $\tilde{v}(t, x)$ is $x$ if $x \ge 1$, and is $v(t, x)$ otherwise. Denoting by $\{ v^*(s, x) \mid s \in S \wedge x \in X \}$ an optimal solution, we have $[\![\forall\Diamond_c r]\!]^p(s) = v^*(s, q(s))$ for all $s \in S$.

The linear programming problems in the above lemma contain at most $2 \cdot |S| \cdot |X|$ variables. Hence, if $q$-values are encoded in binary notation, the number of variables in the encoding is linear in the size of the input encoding of the MDP.


3.3.2 Model checking $\exists\Diamond$ and $\forall\Diamond$ in the fixpoint semantics. The computation of $[\![\exists\Diamond_c r]\!]^f$ on an MDP can be performed by transforming the fixpoint (1) into a linear-programming problem, following a standard approach. Expanding the definition of (1), we have that $[\![\exists\Diamond_c r]\!]^f$ is the unique fixpoint of the following equation in $v \in V_S$: for all $s \in S$,

$$v(s) = q(s) \sqcup \alpha \cdot \max_{a \in \tau(s)} \sum_{t \in S} v(t) \cdot a(t). \qquad (14)$$

The following lemma enables us to compute this fixpoint via linear programming.

Lemma 24 Consider the following linear-programming problem in the set $\{ v(s) \mid s \in S \}$ of variables: minimize $\sum_{s \in S} v(s)$ subject to

$$v(s) \ge q(s), \qquad v(s) \ge \alpha \cdot \sum_{t \in S} v(t) \cdot a(t)$$

for all $s \in S$ and all $a \in \tau(s)$. Denoting by $\{ v^*(s) \mid s \in S \}$ an optimal solution, we have $[\![\exists\Diamond_c r]\!]^f = v^*$.

The above reduction to linear programming yields an algorithm for $[\![\exists\Diamond_c r]\!]^f$ that requires time polynomial in $|\mathcal{S}|_b$ and $|\alpha|_b$. The computation of $[\![\forall\Diamond_c r]\!]^f$, on the other hand, is not known to be reducible in this fashion to linear programming, and as a consequence, we are only able to provide an algorithm that is in nondeterministic polynomial time in $|\mathcal{S}|_b$ and $|\alpha|_b$.

Define the two operators $L_\gamma, L : V_S \to V_S$, where $\gamma \in \Gamma$, as follows, for all $v \in V_S$ and $s \in S$:

$$L_\gamma(v)(s) = q(s) \sqcup \alpha \cdot \sum_{t \in S} v(t) \cdot \gamma(s)(t)$$

$$L(v)(s) = q(s) \sqcup \alpha \cdot \min_{a \in \tau(s)} \sum_{t \in S} v(t) \cdot a(t).$$

Comparing the definition of $L$ with (2), we have that $[\![\forall\Diamond_c r]\!]^f = \mu v.L(v)$. Unfortunately, while (14) consisted only of max-operators, the operator $L$ contains a mixture of max and min, and it is not known how to reduce its computation to the solution of a single linear programming problem.

The fixpoint of $L$ can be computed using a standard policy-improvement scheme [2]. A policy is a mapping $\gamma : S \to \mathit{Distr}(S)$ such that $\gamma(s) \in \tau(s)$ for all $s \in S$; we denote by $\Gamma$ the set of all policies. For a fixed policy $\gamma$, the operator $L_\gamma$ involves only max, and its fixpoint can be computed by linear programming.


Lemma 25 For $\gamma \in \Gamma$, the fixpoint $\mu v.L_\gamma(v)$ coincides with the optimal solution of the linear programming problem in $v \in V_S$ that asks to minimize $\sum_{s \in S} v(s)$ subject to $v(s) \ge q(s)$ and $v(s) \ge \alpha \cdot \sum_{t \in S} v(t) \cdot \gamma(s)(t)$ for all $s \in S$.

For $\gamma \in \Gamma$, we denote the fixpoint of $L_\gamma$ by $v_\gamma = \mu v.L_\gamma(v)$. To obtain a policy iteration scheme, we define the policy improvement operator $H : \Gamma \to \Gamma$ as follows, for all $\gamma \in \Gamma$:

$$H(\gamma)(s) = \arg\min_{a \in \tau(s)} \sum_{t \in S} v_\gamma(t) \cdot a(t).$$

We construct a sequence of policies $\gamma_0, \gamma_1, \gamma_2, \ldots$ by letting $\gamma_0$ be arbitrary, and for $k \ge 0$, by letting $\gamma_{k+1} = H(\gamma_k)$. The convergence of this sequence follows from the fact that there are only finitely many policies, and from the following lemma, which prevents cycles in the sequence.

Lemma 26 For any $\gamma_0 \in \Gamma$, let $\gamma_{k+1} = H(\gamma_k)$ for $k \ge 0$. We have that $v_{\gamma_{k+1}} \le v_{\gamma_k}$ for all $k \ge 0$.

Proof. The result is a consequence of the fact that $v_{\gamma_{k+1}} = \lim_{n \to \infty} L^n_{\gamma_{k+1}}(L(v_{\gamma_k}))$, and of the fact that $L_{\gamma_{k+1}}$ and $L$ are monotonic, with respect to the pointwise ordering of $V_S$. □

The following lemma then enables the computation of $[\![\forall\Diamond_c r]\!]^f$.

Lemma 27 Let $\gamma \in \Gamma$ be arbitrary, and let $\hat{\gamma} = \lim_{k \to \infty} H^k(\gamma)$. Then, $[\![\forall\Diamond_c r]\!]^f = v_{\hat{\gamma}}$.

Proof. First, note that the sequence $\{ H^k(\gamma) \}_{k \ge 0}$ converges, by Lemma 26. Second, note that since $H(\hat{\gamma}) = \hat{\gamma}$, it must be also $L(v_{\hat{\gamma}}) = L_{\hat{\gamma}}(v_{\hat{\gamma}}) = v_{\hat{\gamma}}$, so that $v_{\hat{\gamma}}$ is a fixpoint of $L$. □


The set $\Gamma$ is of size exponential in $\sum_{s \in S} |\tau(s)|$, and this type of policy iteration is not known to terminate in polynomial time. However, the problem can be solved in NPTIME in $|\mathcal{S}|_b$ by guessing an optimal policy.

Lemma 28 The value $[\![\forall\Diamond_c r]\!]^f$ can be computed in NPTIME in $|\mathcal{S}|_b$.

Proof. To compute $[\![\forall\Diamond_c r]\!]^f$, we can guess $\gamma \in \Gamma$ and check that $\gamma = H(\gamma)$; we have then that $[\![\forall\Diamond_c r]\!]^f = \mu v.L_\gamma(v)$. All the required computation can be performed via linear programming. □
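
The policy-improvement scheme can be sketched in Python as follows. This is an illustration under several assumptions, not the procedure analyzed in the lemmas: the fixpoint of $L_\gamma$ is approximated by iteration rather than by linear programming, ties are broken in favor of the current action (with a small tolerance `eps`) so that a stable policy is recognized, and the names `expected`, `policy_value`, `improve`, and `forall_eventually_fixpoint`, as well as the dictionary encoding of the MDP, are hypothetical. The sketch assumes $0 \le \alpha < 1$.

```python
def expected(v, dist):
    """Expected value of v under a successor distribution {state: probability}."""
    return sum(p * v[t] for t, p in dist.items())

def policy_value(q, mdp, policy, alpha, tol=1e-12):
    """Approximate the fixpoint of L_gamma by iteration, starting from q."""
    v = dict(q)
    while True:
        new = {s: max(q[s], alpha * expected(v, mdp[s][policy[s]])) for s in q}
        if max(abs(new[s] - v[s]) for s in q) <= tol:
            return new
        v = new

def improve(mdp, policy, v, eps=1e-9):
    """One application of H: a minimizing action per state, current action on ties."""
    better = {}
    for s, actions in mdp.items():
        best = min(range(len(actions)), key=lambda a: expected(v, actions[a]))
        current = policy[s]
        if expected(v, actions[current]) <= expected(v, actions[best]) + eps:
            best = current
        better[s] = best
    return better

def forall_eventually_fixpoint(q, mdp, alpha):
    """Iterate policy evaluation and improvement until the policy is stable."""
    policy = {s: 0 for s in q}  # arbitrary initial policy gamma_0
    while True:
        v = policy_value(q, mdp, policy, alpha)
        new_policy = improve(mdp, policy, v)
        if new_policy == policy:
            return v
        policy = new_policy


# State s chooses between moving to hi (q = 1) or lo (q = 0.4); both loop.
q = {"s": 0.0, "hi": 1.0, "lo": 0.4}
mdp = {"s": [{"hi": 1.0}, {"lo": 1.0}], "hi": [{"hi": 1.0}], "lo": [{"lo": 1.0}]}
result = forall_eventually_fixpoint(q, mdp, alpha=0.5)
```

Starting from the policy that moves to `hi`, one improvement step switches `s` to the action leading to `lo`, after which the policy is stable and `s` receives the value $0.5 \cdot 0.4 = 0.2$.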


3.3.3 Model checking $\forall\triangle$ in both semantics. With the two semantics for $\forall\triangle_c r$ coinciding, a single algorithm suffices for model checking $\forall\triangle$ in both semantics. The fixpoint semantics of this formula immediately suggests an algorithm based on standard methods used for discounted long-run average problems [2]. Expanding the definition (6), we have that $[\![\forall\triangle_c r]\!]$ is the unique fixpoint of the following equations in $v \in V_S$: for all $s \in S$,

$$v(s) = (1 - \alpha) \cdot q(s) + \alpha \cdot \min_{a \in \tau(s)} \sum_{t \in S} v(t) \cdot a(t).$$

The fixpoint can be easily computed by linear programming, again following a standard approach [2].

Lemma 29 Consider the following linear-programming problem in the set $\{ v(s) \mid s \in S \}$ of variables: maximize $\sum_{s \in S} v(s)$ subject to

$$v(s) \le (1 - \alpha) \cdot q(s) + \alpha \cdot \sum_{t \in S} v(t) \cdot a(t)$$

for all $s \in S$ and all $a \in \tau(s)$. Denoting by $\{ v^*(s) \mid s \in S \}$ an optimal solution, we have $[\![\forall\triangle_c r]\!] = v^*$.
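
As a cross-check on small models, the fixpoint equation for $\forall\triangle_c r$ can also be solved approximately by value iteration. The sketch below uses this substitute technique in place of the linear program of Lemma 29; it assumes $0 \le \alpha < 1$, so that the iterated map is a contraction, and the name `forall_average`, the dictionary encoding of the MDP, and the tolerance `tol` are hypothetical.

```python
def forall_average(q, mdp, alpha, tol=1e-12):
    """Iterate v(s) := (1 - alpha) * q(s) + alpha * min_a sum_t a(t) * v(t).

    mdp maps each state to a list of distributions {successor: probability}.
    """
    v = {s: 0.0 for s in q}
    while True:
        new = {
            s: (1 - alpha) * q[s]
            + alpha * min(sum(p * v[t] for t, p in dist.items()) for dist in mdp[s])
            for s in q
        }
        if max(abs(new[s] - v[s]) for s in q) <= tol:
            return new
        v = new


# State s (q = 1) chooses between hi (q = 1) and lo (q = 0); both loop.
q = {"s": 1.0, "hi": 1.0, "lo": 0.0}
mdp = {"s": [{"hi": 1.0}, {"lo": 1.0}], "hi": [{"hi": 1.0}], "lo": [{"lo": 1.0}]}
averages = forall_average(q, mdp, alpha=0.5)
```

The minimizing choice at `s` is to move to `lo`, so only the first step contributes and the value at `s` is $(1 - \alpha) \cdot 1 = 0.5$.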

3.3.4 Complexity of Dctl model checking over MDPs. The complexity of the model-checking problem for Dctl formulas in MDPs is summarized by the following theorem.

Theorem 7 Given a Dctl formula $\varphi$, an MDP $\mathcal{S} = (S, \tau, P, [\![\cdot]\!])$, and a parameter interpretation $\langle\cdot\rangle$, the following assertions hold:

(1) The problem of computing $[\![\varphi]\!]^p$ over $\mathcal{S}$ with respect to $\langle\cdot\rangle$ can be solved in time polynomial in $|\mathcal{S}|_b$ and $|\langle\cdot\rangle|_b$, and exponential in $|\varphi|$.

(2) The problem of computing $[\![\varphi]\!]^f$ over $\mathcal{S}$ with respect to $\langle\cdot\rangle$ can be solved in nondeterministic polynomial time in $|\mathcal{S}|_b$ and $|\langle\cdot\rangle|_b$, and exponential in $|\varphi|$.

The first part of the theorem follows from Lemmas 23 and 29; the second part follows from Lemmas 24, 28, and 29. Note that the algorithms presented for solving the model-checking problem for Dctl in MDPs have different complexities for the path and fixpoint semantics. This contrasts with the situation for transition systems and Markov chains, where we have presented algorithms for Dctl model checking that are of polynomial time complexity with respect to the size of the system for both the path and the fixpoint semantics. As in the case of Markov chains, in MDPs the complexity of the model-checking problem is exponential in the size of the Dctl formula, due to the blow-up of the binary representations of subformula valuations.


4  Conclusions 

The  traditional  theories  of  discrete  transition  systems  are  boolean:  the  value 
of  a  proposition  at  a  state  is  boolean,  and  the  value  of  a  temporal  property  at 
a  state  is  boolean.  In  boolean  theories,  property  values  are  sensitive  to  small 
perturbations  of  a  system:  if  the  value  of  a  proposition  at  a  single  state  s  is 
switched,  then  the  value  of  a  temporal  property  may  switch  at  an  arbitrary 
distance  from  s.  This  is  problematic,  first,  because  there  may  be  imprecision 
in  models,  and  second,  because  engineering  artifacts  that  are  based  on  boolean 
models  are  equally  fragile. 

We  built  a  continuous  theory  of  discrete  transition  systems  by  systematically 
replacing  boolean  values  with  real  values:  the  value  of  a  proposition  at  a  state 
is  a  real,  and  so  is  the  value  of  a  temporal  property  at  a  state.  In  a  systems 
theory  based  on  the  reals,  it  is  natural  to  introduce  discounting  over  time, 
and  probabilities  over  transitions.  We  achieved  continuity  in  the  sense  that 
small  perturbations  of  the  reals  that  specify  a  system  lead  to  small  changes  in 
the  values  of  discounted  temporal  properties.  The  resulting  theory  is  therefore 
robust  against  imprecisions  in  measurement  and  implementation. 

We showed that over probabilistic systems, the standard temporal operators can be given two different natural, continuous interpretations: a path semantics and a fixpoint semantics. The fixpoint semantics corresponds to a continuous generalization of state bisimilarity [10], while no such characterization is known for the path semantics. On the other hand, the path semantics gives a natural limit interpretation to infinite behaviors of a system. We presented model-checking algorithms for both semantics, but the question whether the fixpoint semantics of $\forall\Diamond$ properties over MDPs can be computed in polynomial time remains open.